Michael Reiter

Papers from this author

Detecting Rare Cell Populations in Flow Cytometry Data Using UMAP

Lisa Weijler, Markus Diem, Michael Reiter

Responsive image

Auto-TLDR; Unsupervised Manifold Approximation and Projection for Small Cell Population Detection in Flow cytometry Data

Slides Poster Similar

We present an approach for detecting small cell populations in flow cytometry (FCM) samples based on the combination of unsupervised manifold embedding and supervised random forest classification. Each sample consists of hundred thousands to a few million cells where each cell typically corresponds to a measurement vector with 10 to 50 dimensions. The difficulty of the task is that clusters of measurement vectors formed in the data space according to standard clustering criteria often do not correspond to biologically meaningful sub-populations of cells, due to strong variations in shape and size of their distributions. In many cases the relevant population consists of less than 100 scattered events out of millions of events, where supervised approaches perform better than unsupervised clustering. The aim of this paper is to demonstrate that the performance of the standard supervised classifier can be improved significantly by combining it with a preceding unsupervised learning step involving the Uniform Manifold Approximation and Projection (UMAP). We present an experimental evaluation on FCM data from children suffering from Acute Lymphoblastic Leukemia (ALL) showing that the improvement particularly occurs in difficult samples where the size of the relevant population of leukemic cells is low in relation to other sub-populations. Further, the experiments indicate that on such samples the algorithm also outperforms other baseline methods based on Gaussian Mixture Models.