Flow Cytometry Statistics

Written by Tim Bushnell, PhD

Understanding statistics, and how statistical analysis applies to flow cytometry, is critical to interpreting flow cytometry data.

One of the strengths of flow cytometry is that it generates large amounts of data that are amenable to statistical analysis of our populations of interest. Using the standard set of statistical analysis tools allows for hypothesis testing and, ultimately, for determining whether differences in the datasets are statistically significant.

There are two basic classes of questions that are typically asked in flow cytometry. The first class relates to changes in the number or percentage of a specific population upon treatment or disease state. A hypothesis in this class might look like this:

Case 1: In patients suffering from Bowden’s Malady, treatment with Pescaline D causes no change in the percentage of CD86+ memory T cells.

The second class of questions asked in flow cytometry relates to changes in the expression of a given antigen upon treatment or disease state. A hypothesis in this class might be phrased as:

Case 2: In patients suffering from Bowden’s Malady, treatment with Pescaline D causes no change to the expression of Interferon gamma on CD86+ memory T cells.

Once the question is determined, an appropriate experiment would be performed with sufficient replicates (as determined by a power calculation), so that the correct data can be extracted for statistical analysis.
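As a rough illustration of the power calculation mentioned above, the sketch below estimates the number of replicates per group for a two-group comparison using the common normal-approximation formula. The function name, the effect size, and the standard deviation are hypothetical and chosen only for illustration; a dedicated power-analysis tool based on the t distribution will usually suggest a slightly larger number.

```python
import math
from statistics import NormalDist


def replicates_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate replicates per group for a two-sided, two-group comparison.

    Normal approximation: n = 2 * ((z_(1-alpha/2) + z_power) * sd / delta) ** 2
    delta: smallest difference worth detecting (same units as sd)
    sd:    assumed common standard deviation of the measurement
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sd / delta) ** 2
    return math.ceil(n)


# Hypothetical inputs: detect a 5-point change in percent CD86+ memory T cells
# when the replicate-to-replicate SD is assumed to be 4 points.
n = replicates_per_group(delta=5.0, sd=4.0)
print(f"replicates per group: {n}")
```

Halving the difference to be detected roughly quadruples the number of replicates required, which is why the effect size should be settled before the experiment is run.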

In Case 1, the data would be the percentage of CD86+ memory T cells in patients with Bowden’s Malady, with and without treatment. These data would be compared using a T-test to determine significance. To perform the T-test, the investigator would need to define the significance threshold (the α value) in advance and calculate the P value.

When P < α: reject the null hypothesis, and the difference is ‘statistically significant’

When P > α: can’t reject the null hypothesis, and the difference is ‘not statistically significant’
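The decision rule above can be sketched in a few lines of Python. This example assumes the treated and untreated groups are independent and uses Welch's variant of the T-test, which does not assume equal variances; the article does not specify which variant to use, and a paired test would be the better choice if the same patients were measured before and after treatment. The helper names and the percentages are hypothetical. The P value is obtained by numerically integrating the t density so that the example needs only the standard library; in practice a statistics package would normally be used.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided P value: 1 - 2 * (integral of the density from 0 to |t|).

    The integral is evaluated with Simpson's rule (steps must be even).
    """
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1 - 2 * total * h / 3)


def welch_t_test(a, b):
    """Welch's t-test for two independent samples; returns (t, df, P)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, two_sided_p(t, df)


# Hypothetical percentages of CD86+ memory T cells (illustrative, not real data)
untreated = [12.1, 14.3, 11.8, 13.5, 12.9, 13.7]
treated = [15.2, 16.8, 14.9, 17.1, 15.8, 16.2]

alpha = 0.05  # threshold fixed before looking at the data
t, df, p = welch_t_test(untreated, treated)
significant = p < alpha
print(f"t = {t:.2f}, df = {df:.1f}, P = {p:.4g}, significant: {significant}")
```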

In Case 2, the data that needs to be extracted is the central tendency of the expression of Interferon gamma on the CD86+ memory T cells. This is best represented as the Median Fluorescent Intensity (MFI). Additionally, the robust Standard Deviation (rSD) should be calculated, as it measures the spread of the data around the Median.
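A minimal sketch of these two summary statistics follows. It assumes the rSD is estimated from the median absolute deviation (MAD) scaled by 1.4826, which is one common definition; some analysis packages derive the rSD from percentiles instead, so check which definition your software uses. The function name and intensity values are hypothetical.

```python
from statistics import median, stdev


def robust_sd(values):
    """Robust SD estimated as 1.4826 * median absolute deviation (MAD).

    Assumption: MAD-based definition. The factor 1.4826 makes the estimate
    agree with the ordinary SD when the data are Gaussian.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return 1.4826 * mad


# Hypothetical fluorescence intensities with one extreme outlier event
intensities = [410, 395, 430, 405, 420, 415, 400, 5200]

mfi = median(intensities)
rsd = robust_sd(intensities)
print(f"median = {mfi}, rSD = {rsd:.3f}, ordinary SD = {stdev(intensities):.1f}")
```

The single outlier inflates the ordinary standard deviation enormously, while the median and rSD barely move, which is the reason robust statistics are preferred for fluorescence data.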

Before you move to hypothesis testing, it is often best to convert these data to a fold-over-background or resolution metric (R_D) value. This is especially important when performing multiple experiments.

The R_D is better as it accounts for the spread of the data, not just the separation between experimental and control.

$$R_D = \frac{\mathrm{Median}_{exp} - \mathrm{Median}_{ctl}}{\mathrm{rSD}_{exp} + \mathrm{rSD}_{ctl}}$$
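The resolution metric divides the separation between the experimental and control medians by the sum of their rSD values. A short sketch, with a hypothetical function name and made-up inputs, shows how the same separation yields a lower R_D when the populations are more spread out.

```python
def resolution_metric(median_exp, median_ctl, rsd_exp, rsd_ctl):
    """R_D = (Median_exp - Median_ctl) / (rSD_exp + rSD_ctl)."""
    spread = rsd_exp + rsd_ctl
    if spread <= 0:
        raise ValueError("combined rSD must be positive")
    return (median_exp - median_ctl) / spread


# Hypothetical values: identical separation of medians, different spread
tight = resolution_metric(median_exp=2400.0, median_ctl=400.0,
                          rsd_exp=350.0, rsd_ctl=50.0)
wide = resolution_metric(median_exp=2400.0, median_ctl=400.0,
                         rsd_exp=700.0, rsd_ctl=100.0)
print(f"tight populations: R_D = {tight}, wide populations: R_D = {wide}")
```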

Once the R_D is calculated, you can move to hypothesis testing using a T-test against a hypothetical mean. In this case, the hypothetical mean would be 0. Again, the investigator would need to define the threshold (the α value) and calculate the P value.
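For reference, the standard one-sample T-test statistic for this comparison is shown below. The symbols are introduced here for illustration: n is the number of independent experiments, \bar{R}_D is the mean of their R_D values, s is the sample standard deviation of those values, and \mu_0 is the hypothetical mean.

$$t = \frac{\bar{R}_D - \mu_0}{s / \sqrt{n}}, \qquad \mu_0 = 0, \qquad \mathrm{df} = n - 1$$

As a purely hypothetical example, five experiments with a mean R_D of 4.2 and a standard deviation of 1.1 would give t = 4.2 / (1.1 / \sqrt{5}) ≈ 8.54 with 4 degrees of freedom. The P value is then read from the t distribution and compared with α as before.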

The caveat for the T-test is that it assumes the data follow a Gaussian distribution. If your data are not Gaussian distributed, there are similar non-parametric tests that can be performed. These also report a P value, which is used in the same way to identify statistical significance.
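One such non-parametric option is a permutation test, sketched below. The article does not name a specific test, so this choice is an assumption made for illustration, as are the function name and the data. The test asks how often a difference in group means at least as large as the observed one would arise if the group labels were assigned at random, and it makes no assumption about the shape of the distribution.

```python
from itertools import combinations
from statistics import mean


def permutation_test(a, b):
    """Exact two-sided permutation test on the difference in group means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes, so it is only practical for small samples.
    """
    pooled = list(a) + list(b)
    observed = abs(mean(a) - mean(b))
    total = sum(pooled)
    n_a, n_b = len(a), len(b)
    extreme = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / count


# Hypothetical, visibly skewed data (one very bright sample in the treated group)
control = [1.1, 1.3, 0.9, 1.2, 1.0]
treated = [2.4, 3.1, 2.2, 9.8, 2.7]

alpha = 0.05
p = permutation_test(control, treated)
print(f"P = {p:.4f}, significant: {p < alpha}")
```

For larger samples, where full enumeration is impractical, a random subset of relabelings or a rank-based test from a statistics package is the usual substitute.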

These basic pair-wise comparison tests allow for determination of statistical significance between two populations. If you have more than two populations, or more complex questions, there are additional statistical tools that can be used, such as regression analysis and ANOVA.

ABOUT TIM BUSHNELL, PHD

Tim Bushnell holds a PhD in Biology from Rensselaer Polytechnic Institute. He is a co-founder of, and the didactic mind behind, ExCyte, the world’s leading flow cytometry training company, which maintains a library of in-the-lab resources on sequencing, microscopy, and related topics in the life sciences.


