5 Essential Calculations For Accurate Flow Cytometry Results

Written by Tim Bushnell, PhD

Flow cytometry is a numbers game. There are percentages of a population, fluorescence intensity measurements, sample averages, data normalization, and more. Many of these common calculations are useful, but surrounded by misconceptions. This primer will help you decide which calculation to use, when to use it, and how to interpret the results.

1. Staining Index

The staining index (SI) is a way to measure the relative brightness of a fluorochrome and compare it to other fluorochromes in a biologically relevant manner.

The SI is useful for ranking fluorochrome brightness on your instrument of choice. It is also a useful tool for evaluating titration data.

SI is a relative number, so it is best to focus on comparisons, and not the absolute value.

To decide which fluorochrome is brighter than another, you can consult the relative rankings published by vendors such as BioLegend and BD, which are based on a standard analysis. These are useful, but if your system is significantly different from the standard, you may benefit from performing the experiments yourself.

SI was first reported in a Bigos cytometry abstract, and popularized by Maecker et al. The original concept was a way to compare and rank fluorochromes by relative brightness, to help researchers choose among them. The calculation is shown below (Figure 1).

Figure 1. Schematic and formula for the Staining Index.

Briefly, the distance is the central tendency of the positive (the mean, in the classical definition) minus the central tendency of the negative. This is divided by twice the spread of the negative, as measured by the standard deviation.
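The description above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the classical definition (mean and sample standard deviation), and the function name and fluorescence values are hypothetical. Many labs substitute the median and a robust standard deviation.

```python
from statistics import mean, stdev


def staining_index(positive, negative):
    """SI = (mean of positive - mean of negative) / (2 * SD of negative)."""
    spread = stdev(negative)  # sample standard deviation of the negative population
    if spread == 0:
        raise ValueError("negative population has zero spread")
    return (mean(positive) - mean(negative)) / (2 * spread)


# Hypothetical fluorescence intensities for illustration only.
negative = [90, 100, 110]
positive = [1900, 2000, 2100]

si = staining_index(positive, negative)
print(f"Staining index: {si:.1f}")
```

Because SI is a relative number, the value printed here only means something when compared with the SI of another fluorochrome, or another concentration, measured the same way on the same instrument.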

The SI has several uses. Most notable is the generation of the staining index chart, such as the one shown in Table 1. By calculating the relative brightness of each fluorochrome, you get a tool to help you decide which fluorochromes to use during panel building.

In Table 1, LSR-12A is a 3-laser (405, 488, and 633 nm) system, while LSR-18A is a 4-laser (405, 488, 532, and 633 nm) system.

Table 1: Staining index comparing two different instruments.

Notice the differences in the relative rankings. Some are easy to explain. For example, AF532 is relatively brighter on LSR-18A than on LSR-12A, due to the presence of the 532 laser.

Other differences are related to sensitivity and background on these different instruments. For example, APC (which is considered relatively bright) is not as bright as the FITC signal on either instrument. The background fluorescence and spread on these two machines drive this observation.

Another use of the SI is in titration data to identify the best concentration. While this is often done by eye, plotting the SI against concentration makes the optimum easier to see (Figure 2).

Figure 2: Titration data, using staining index vs. concentration.
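As a rough sketch of how titration data might be evaluated numerically, the snippet below computes the SI at each concentration and picks the highest. The concentrations, summary statistics, and variable names are all hypothetical; it assumes you have already extracted the positive mean, negative mean, and negative standard deviation for each titration point.

```python
def staining_index(pos_center, neg_center, neg_sd):
    """SI from summary statistics: (positive - negative) / (2 * SD of negative)."""
    return (pos_center - neg_center) / (2 * neg_sd)


# Hypothetical titration: antibody amount per test (ug) ->
# (positive mean, negative mean, negative SD)
titration = {
    0.125: (800, 100, 20),
    0.25: (1500, 105, 21),
    0.5: (2400, 110, 22),
    1.0: (2900, 150, 30),
    2.0: (3100, 220, 44),
}

si_by_conc = {conc: staining_index(*stats) for conc, stats in titration.items()}
best_conc = max(si_by_conc, key=si_by_conc.get)

for conc, si in si_by_conc.items():
    print(f"{conc:>6} ug/test  SI = {si:.1f}")
print(f"Highest SI at {best_conc} ug/test")
```

In this made-up data set the positive signal keeps rising at the higher concentrations, but the background and spread of the negative rise faster, so the SI peaks at an intermediate concentration.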

The staining index is a useful calculation and should be in your flow cytometry toolkit.

As a note, Telford and co-workers published a variation of the staining index.

Having run these equations side-by-side, I have yet to see a difference, so choose the one you are most comfortable with and use it.

2. Data Normalization

Sometimes, data needs to be normalized. This process typically involves identifying an appropriate control population and dividing the experimental value by the control value.

A simple fold over background calculation could be as easy as % positive/% control cells to yield a single metric that can be taken to statistical analysis.
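A minimal sketch of the fold-over-background calculation is shown here. The sample names and percentages are hypothetical, and the guard against a zero control value is an added assumption rather than something the text specifies.

```python
def fold_over_background(pct_positive, pct_control):
    """Return % positive in the experimental sample divided by % positive in the control."""
    if pct_control <= 0:
        raise ValueError("control percentage must be greater than zero")
    return pct_positive / pct_control


# Hypothetical (experimental %, control %) pairs, one per sample.
samples = {
    "donor_1": (12.0, 0.5),
    "donor_2": (9.0, 0.6),
    "donor_3": (15.0, 0.5),
}

folds = {name: fold_over_background(exp, ctrl) for name, (exp, ctrl) in samples.items()}
for name, fold in folds.items():
    print(f"{name}: {fold:.1f}-fold over background")
```

Each sample is reduced to a single number, which is the form most statistical tests expect.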

For expression-based calculations, use of the resolution metric (R_D) is recommended.

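This metric is based on Fisher’s Discriminant Ratio. Using this equation, the difference between two populations is measured and corrected for by the sum of the standard deviations. In its base form: R_D = (central value of the positive - central value of the negative) / (standard deviation of the positive + standard deviation of the negative).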

Using this formula, it is possible to convert measurements taken on different days to a single, unitless number that is better suited for comparisons. This calculation has been used in genomics analysis for a while, and is becoming common in flow cytometry as well.
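The resolution metric can be sketched as follows, based on the description above: the difference between the two populations divided by the sum of their standard deviations. The mean and sample standard deviation are used here as an assumption; median and robust standard deviation are common substitutes. The function name and data are hypothetical.

```python
from statistics import mean, stdev


def resolution_metric(positive, negative):
    """R_D = (center of positive - center of negative) / (SD of positive + SD of negative)."""
    total_spread = stdev(positive) + stdev(negative)
    if total_spread == 0:
        raise ValueError("both populations have zero spread")
    return (mean(positive) - mean(negative)) / total_spread


# Hypothetical fluorescence intensities from a single day's run.
negative = [90, 100, 110]
positive = [1900, 2000, 2100]

r_d = resolution_metric(positive, negative)
print(f"R_D: {r_d:.2f}")
```

Because the result is unitless, values calculated on different days can be compared directly, provided the same definition of center and spread is used each time.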

3. Statistical Calculations

There are a variety of statistical tools that you will need to use in summarizing your data and evaluating the hypothesis that the experiments were designed to test. At the end of the day, there are a bunch of numbers that have to be properly analyzed.

If the experimental plan was properly laid out, the statistical analytical methods have already been laid out as well.

The practical math behind each of the different methods is not something to worry about in this post. There are great software packages out there that can do the calculations for you.

However, it is important to be aware of several things when discussing statistical calculations.

Choose the right test: You need to choose the correct statistical test based on the hypothesis you are testing. This could be a t-test, ANOVA, linear regression, or one of a handful of other tests, depending on the distribution of the data and the comparisons being made.
Set the proper threshold: The α value is the threshold that will be used to determine if your data meets the criteria to reject the null hypothesis. If the calculated P value is less than the threshold, the results are considered statistically significant (you reject the null hypothesis). If the P value is greater than the threshold, you cannot reject the null hypothesis. Of course, it is important to make sure that the question the experiment was designed to test is biologically relevant. There are cases where significance is found, but the question was scientifically trivial. Make sure to state the hypothesis at the beginning and follow through to the end.
Collect enough samples: You don’t want to end up in a difficult discussion with a biostatistician after the data are collected. A power calculation can help you determine the number of samples you need to collect to properly analyze your experiments.
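To give a feel for what a power calculation does, here is a rough sketch that estimates the number of samples per group for a two-group comparison. It uses the standard normal approximation, n = 2 * ((z for α/2 + z for power) / effect size)^2, which is an assumption chosen for simplicity; dedicated statistics software uses the t distribution and will return slightly larger numbers. The function name and default values are illustrative.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison.

    effect_size is the standardized difference (difference in means / SD).
    """
    if effect_size <= 0:
        raise ValueError("effect size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided threshold
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


for d in (0.5, 1.0, 2.0):
    print(f"effect size {d}: about {samples_per_group(d)} samples per group")
```

The takeaway is the shape of the relationship: halving the effect size you want to detect roughly quadruples the number of samples needed.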

4. Sorting Calculations

Cell sorting is a powerful tool in isolating interesting cells from the background cells in the system.

From a simple GFP+ sort, to a complex multicolor panel to isolate a rare circulating tumor cell, there is a lot of math behind sorting, and not just for getting the system to work. Here are some calculations that will help you answer the most common sorting questions.

How fast can I sort?

Once you realize that cell sorters are sorting droplets of liquid, things start to become a bit easier. With electrostatic cell sorters, the goal is to have one cell in one droplet, and no cells in the surrounding droplets.

This process is governed by Poisson statistics.

As shown in this figure from Rui Gardner, head of Flow Cytometry at Memorial Sloan Kettering, having 1 cell every 4 drops gives you a reasonable probability for having no cells in the leading or lagging drop.

So, the simple calculation of drop drive frequency divided by 4 will give you the maximum event rate you should strive for on the sorter.
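The reasoning can be checked with a short sketch. With an average of 1 cell every 4 drops, the Poisson mean is 0.25 cells per drop. The 40 kHz drop drive frequency below is a hypothetical value chosen only for illustration.

```python
from math import exp, factorial


def poisson_pmf(k, cells_per_drop):
    """Probability that a drop contains exactly k cells."""
    return cells_per_drop ** k * exp(-cells_per_drop) / factorial(k)


def max_event_rate(drop_drive_frequency_hz, drops_per_cell=4):
    """Suggested maximum event rate: drop drive frequency divided by 4."""
    return drop_drive_frequency_hz / drops_per_cell


cells_per_drop = 1 / 4
p_empty = poisson_pmf(0, cells_per_drop)
p_single = poisson_pmf(1, cells_per_drop)
p_multiple = 1 - p_empty - p_single

print(f"P(empty drop)      = {p_empty:.3f}")
print(f"P(exactly 1 cell)  = {p_single:.3f}")
print(f"P(2 or more cells) = {p_multiple:.3f}")
print(f"Max event rate at 40 kHz: {max_event_rate(40_000):.0f} events/sec")
```

At this loading, most neighboring drops are empty and only a small fraction of drops contain more than one cell, which is why the divide-by-4 rule works well as a starting point.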

How many cells do I need to start with? How long will it take?

Starting with the required number of cells for the downstream application, it is possible to approximate how many cells to start with so that you will end up with enough cells in the end. At the same time, we can estimate how long the run should take, barring unforeseen circumstances.

Total cells needed / (frequency of population * sort efficiency) = starting population. I like to double the starting population to account for losses during processing.
Starting population / (max events per second) = time of sort (sec)
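The two formulas above can be combined into a quick planning sketch. All of the numbers are hypothetical: 100,000 cells needed, a target population at 1% frequency, 80% sort efficiency, and a maximum of 10,000 events per second. The doubling to cover processing losses is applied through an assumed `safety_factor` parameter.

```python
def starting_cells(cells_needed, population_frequency, sort_efficiency, safety_factor=2.0):
    """Cells to start with, including a safety factor for processing losses."""
    if not (0 < population_frequency <= 1 and 0 < sort_efficiency <= 1):
        raise ValueError("frequency and efficiency must be fractions between 0 and 1")
    return safety_factor * cells_needed / (population_frequency * sort_efficiency)


def sort_time_seconds(start_cells, max_events_per_second):
    """Time to run the whole starting population through the sorter."""
    return start_cells / max_events_per_second


start = starting_cells(cells_needed=100_000, population_frequency=0.01, sort_efficiency=0.80)
seconds = sort_time_seconds(start, max_events_per_second=10_000)

print(f"Starting population: {start:,.0f} cells")
print(f"Estimated sort time: {seconds / 60:.0f} minutes")
```

Even as a ballpark, this kind of estimate shows quickly whether a sort is a one-hour booking or an all-day one.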

How pure is my sample? What is my post-sort recovery?

Here are three common values that can be used to characterize a sort.

Armed with this information, it is possible for you to figure out how long your sort will take, so you can plan accordingly. You can also observe how good the sort was and if you have enough cells for your downstream application.

5. Compensation

No post about flow cytometry calculations could be complete without touching on the most fundamental of calculations in flow cytometry — the calculation of the compensation matrix.

The compensation matrix is essential for good flow cytometry. It accounts for the spectral overlap from a given fluorochrome into a secondary channel, so that the true signal can be identified.
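For intuition only, here is a two-color sketch of what the software does: the observed signals are the true signals mixed by a spillover matrix, and compensation applies the inverse of that matrix. The spillover values (15% of FITC into the PE detector, 2% of PE into the FITC detector) are hypothetical, and real analyses should rely on the acquisition or analysis software with proper controls.

```python
def invert_2x2(matrix):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("spillover matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]


def apply_matrix(signal, matrix):
    """Multiply a row vector of detector signals by a 2x2 matrix."""
    return [sum(signal[i] * matrix[i][j] for i in range(2)) for j in range(2)]


# Rows are fluorochromes (FITC, PE); columns are detectors (FITC, PE).
# Hypothetical spillover fractions for illustration.
spillover = [
    [1.00, 0.15],  # FITC: 15% of its signal appears in the PE detector
    [0.02, 1.00],  # PE: 2% of its signal appears in the FITC detector
]

true_signal = [1000.0, 0.0]                       # a FITC-only event
observed = apply_matrix(true_signal, spillover)   # what the detectors report
compensated = apply_matrix(observed, invert_2x2(spillover))

print("Observed:   ", [round(x, 1) for x in observed])
print("Compensated:", [round(x, 1) for x in compensated])
```

The FITC-only event shows apparent PE signal before compensation and essentially none afterward, which is the behavior the single-stained controls are used to establish.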

As a reminder, the 3 rules of compensation are:

The compensation sample should be at least as bright as the experimental samples to which the compensation will be applied.
The backgrounds of the negative and carrier must be matched (no universal negative; cells-to-cells, beads-to-beads).
The compensation color must be matched to the experimental color.
1. Same fluorochrome (FITC ≠ A488, tandems must be from exact same stock).
2. Same sensitivity (don’t change voltage between tubes).

And, as always, collect enough events. Following these rules will ensure you have consistent and correct compensation.

If you have not yet integrated these calculations into your workflow, consider where each would be useful. Some, like the SI, are very useful in the development of new panels. From titration to voltration, the SI makes it easy to compare the different samples. While it is possible to plot the data and try to gauge by eye, having a number makes the decision much easier. When preparing for a sort, it is vital to do these calculations, even if only for a ballpark estimate of how many cells you might need to start with.

Compensation is, of course, one of the most critical calculations, so make sure you provide the correct controls that meet the “3 Rules”, and let the software do the work. In the end, doing these calculations should help you with your work, as it will improve consistency and reproducibility, and ensure you have sufficient cells for your downstream applications.


ABOUT TIM BUSHNELL, PHD

Tim Bushnell holds a PhD in Biology from the Rensselaer Polytechnic Institute. He is a co-founder of, and the didactic mind behind, ExCyte, the world’s leading flow cytometry training company, which maintains a library of in-the-lab resources on sequencing, microscopy, and related topics in the life sciences.
