5 Essential Calculations For Accurate Flow Cytometry Results
Flow cytometry is a numbers game. There are percentages of a population, fluorescence intensity measurements, sample averages, data normalization, and more. Many of these common calculations are useful, but surrounded by misconceptions. This primer will help you decide which calculation to use, when to use it, and how to interpret the results.
1. Staining Index
The staining index (SI) is a way to measure the relative brightness of a fluorochrome and compare it to other fluorochromes in a biologically relevant manner.
The SI is useful for ranking fluorochrome brightness on your instrument of choice. It is also a useful tool for evaluating titration data.
SI is a relative number, so it is best to focus on comparisons, and not the absolute value.
When deciding which of two fluorochromes is brighter, vendors such as BioLegend and BD publish relative rankings based on a standard analysis. These are useful, but if your system is significantly different from the standard, you may benefit from performing the experiments yourself.
SI was first reported in a cytometry abstract by Bigos and popularized by Maecker et al. It was conceived as a way to compare and rank fluorochromes by relative brightness, which helps researchers choose between them, as shown below (Figure 1).
Figure 1. Schematic and formula for the Staining Index.
Briefly, the numerator is the distance between the two populations: the central tendency of the positive population (the mean, in the classical definition) minus the central tendency of the negative population. This distance is divided by twice the spread of the negative population, as measured by its standard deviation.
The SI has several uses. Most notable is the generation of a staining index chart, such as the one shown in Table 1. By calculating the relative brightness of each fluorochrome, you get a tool to help you decide which fluorochromes to use during panel building.
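The description above translates into a few lines of Python. This is a minimal sketch, not a reference implementation: the function name `staining_index`, the `center` parameter, and the intensity values are all made up for illustration, and the sample standard deviation is assumed as the measure of spread.

```python
from statistics import mean, stdev


def staining_index(positive, negative, center=mean):
    """(center of positives - center of negatives) / (2 * SD of negatives).

    `center` defaults to the mean (the classical definition); pass
    statistics.median to use a different measure of central tendency.
    """
    spread = stdev(negative)  # sample standard deviation of the negatives
    if spread == 0:
        raise ValueError("negative population has zero spread")
    return (center(positive) - center(negative)) / (2 * spread)


# Hypothetical fluorescence intensities, for illustration only
negative_events = [90, 100, 110, 95, 105]
positive_events = [4800, 5000, 5200, 4900, 5100]

si = staining_index(positive_events, negative_events)
print(f"Staining index: {si:.1f}")
```

Because the SI is a relative number, the printed value only means something next to the SI of another fluorochrome measured the same way on the same instrument.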
In Table 1, LSR-12A is a 3-laser (405, 488, and 633 nm) system, while LSR-18A is a 4-laser (405, 488, 532, and 633 nm) system.
Table 1: Staining index comparing two different instruments.
Notice the differences in the relative rankings. Some are easy to explain. For example, AF532 is relatively brighter on LSR-18A than on LSR-12A, due to the presence of the 532 laser.
Other differences are related to sensitivity and background on these different instruments. For example, APC (which is considered relatively bright) is not as bright as the FITC signal on either instrument. The background fluorescence and spread on these two machines drive this observation.
Another use of the SI is identifying the best concentration from titration data. While the best concentration is often picked by eye, plotting the SI against concentration makes the choice much easier to see (Figure 2).
Figure 2: Titration data, using staining index vs. concentration.
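If you would rather not read the optimum off a plot, the same comparison can be scripted. The sketch below assumes a titration stored as a dictionary mapping each antibody concentration to its positive and negative events; the helper `best_titer`, the concentrations, and the intensities are hypothetical.

```python
from statistics import mean, stdev


def staining_index(positive, negative):
    """Classical SI: difference of means over twice the negative SD."""
    return (mean(positive) - mean(negative)) / (2 * stdev(negative))


def best_titer(titration):
    """Return the concentration with the highest SI, plus all the scores.

    `titration` maps concentration -> (positive_events, negative_events).
    """
    scores = {conc: staining_index(pos, neg)
              for conc, (pos, neg) in titration.items()}
    return max(scores, key=scores.get), scores


# Hypothetical titration series (concentration in ug/mL)
titration = {
    0.25: ([900, 1000, 1100], [95, 100, 105]),
    0.5: ([2900, 3000, 3100], [94, 100, 106]),
    1.0: ([4900, 5000, 5100], [90, 100, 110]),
    2.0: ([5400, 5500, 5600], [170, 200, 230]),
}

best, scores = best_titer(titration)
for conc, score in sorted(scores.items()):
    print(f"{conc:>5} ug/mL  SI = {score:6.1f}")
print("Highest SI at", best, "ug/mL")
```

In this made-up series the highest concentration has the brightest positives but loses on SI, because the negative population has also become brighter and broader.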
The staining index is a useful calculation and should be in your flow cytometry toolkit.
As a note, Telford and co-workers published a variation of the staining index.
Having run these equations side-by-side, I have yet to see a difference, so choose the one you are most comfortable with and use it.
2. Data Normalization
Sometimes, data needs to be normalized. This process typically involves identifying an appropriate control population and dividing the experimental value by the control value.
A simple fold over background calculation could be as easy as % positive/% control cells to yield a single metric that can be taken to statistical analysis.
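As a quick sketch of that fold over background calculation (the function name and the percentages are invented for the example):

```python
def fold_over_background(percent_positive, percent_control):
    """Divide the experimental % positive by the control % positive."""
    if percent_control <= 0:
        raise ValueError("control percentage must be greater than zero")
    return percent_positive / percent_control


# Hypothetical sample: 12% positive in the stimulated tube, 1.5% in the control
fold = fold_over_background(12.0, 1.5)
print(f"{fold:.1f}-fold over background")
```

The guard against a zero control matters in practice: a control tube with no positive events makes the ratio undefined, so it is worth deciding in advance how you will handle that case.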
For expression-based calculations, use of the resolution metric (RD) is recommended.
This metric is based on Fisher’s Discriminant Ratio. Using this equation, the difference between two populations is measured and corrected for by the sum of the standard deviations. The base formula is shown here:
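Written out from that description, the ratio takes roughly the following form. The symbols are labels chosen here for illustration: the numerator is the difference between the centers of the two populations and the denominator is the sum of their standard deviations. Robust equivalents (medians and robust standard deviations) can be substituted if that is what your analysis software reports.

```latex
R_D = \frac{\bar{x}_{1} - \bar{x}_{2}}{\sigma_{1} + \sigma_{2}}
```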
Using this formula, it is possible to convert measurements taken on different days to a single, unitless number that is better suited for comparisons. This calculation has been used in genomics analysis for a while, and is becoming common in flow cytometry as well.
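To see why this helps when comparing days, here is a small sketch that computes the metric as described (difference between the two populations divided by the sum of their standard deviations). The function name, the use of the mean and sample standard deviation, and the data are assumptions for illustration. The second "day" is simply the first with every intensity doubled, as might happen after a gain change.

```python
from statistics import mean, stdev


def resolution_metric(population_a, population_b):
    """Difference between two populations over the sum of their SDs."""
    spread = stdev(population_a) + stdev(population_b)
    if spread == 0:
        raise ValueError("both populations have zero spread")
    return (mean(population_a) - mean(population_b)) / spread


# Hypothetical intensities: day 2 is day 1 with every value doubled
day1_pos, day1_neg = [900, 1000, 1100], [90, 100, 110]
day2_pos = [2 * x for x in day1_pos]
day2_neg = [2 * x for x in day1_neg]

rd_day1 = resolution_metric(day1_pos, day1_neg)
rd_day2 = resolution_metric(day2_pos, day2_neg)
print(f"Day 1 RD = {rd_day1:.2f}, Day 2 RD = {rd_day2:.2f}")
```

The raw intensities differ twofold between the two days, but the unitless metric is the same, which is what makes it suitable for comparisons.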
3. Statistical calculations
There are a variety of statistical tools that you will need to use in summarizing your data and evaluating the hypothesis that the experiments were designed to test. At the end of the day, there are a bunch of numbers that have to be properly analyzed.
If the experimental plan was properly laid out, the statistical methods were chosen at the same time.
The practical math behind each of the different methods is not something to worry about in this post. There are great software packages out there that can do the calculations for you.
However, it is important to be aware of several things when discussing statistical calculations.
- Choose the right test: You need to choose the correct statistical test based on the hypothesis you are testing. This could be a t-test, ANOVA, linear regression, or one of a handful of other tests, depending on the distribution of the data and the comparisons being made.
- Set the proper threshold: The α value is the threshold used to determine whether your data meets the criteria to reject the null hypothesis. If the calculated P value is less than the threshold, the result is considered statistically significant (you reject the null hypothesis). If the P value is greater than the threshold, you cannot reject the null hypothesis. Of course, the question the experiment was designed to test must itself be important and biologically relevant. There are cases where significance is found, but the question was scientifically trivial. Make sure to state the hypothesis at the beginning and follow through to the end.
- Collect enough samples: To avoid a difficult conversation with a biostatistician after the fact, use a power calculation to determine the number of samples you should collect to properly analyze your experiments.
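To give a feel for what a power calculation involves, the sketch below estimates the number of samples per group for a two-sided comparison of two group means. It uses the common normal approximation, which slightly underestimates what a t-based calculation in a dedicated statistics package would return, so treat it as a ballpark. The function names and the effect sizes are assumptions; `effect_size` is the expected difference in means divided by the standard deviation.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-group comparison."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the threshold
    z_power = z.inv_cdf(power)  # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def is_significant(p_value, alpha=0.05):
    """Reject the null hypothesis only when P is below the threshold."""
    return p_value < alpha


n_large_effect = samples_per_group(1.0)
n_medium_effect = samples_per_group(0.5)
print(n_large_effect, "per group for a large effect")
print(n_medium_effect, "per group for a medium effect")
```

Halving the expected effect size roughly quadruples the number of samples needed, which is why it pays to run this calculation before the experiment rather than after.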
4. Sorting Calculations
Cell sorting is a powerful tool for isolating interesting cells from the background cells in the system.
From a simple GFP+ sort, to a complex multicolor panel to isolate a rare circulating tumor cell, there is a lot of math behind sorting, and not just for getting the system to work. Here are some calculations that will help you answer the most common sorting questions.
How fast can I sort?
Once you realize that cell sorters are sorting droplets of liquid, things start to become a bit easier. With electrostatic cell sorters, the goal is to have one cell in one droplet, and no cells in the surrounding droplets.
This process is governed by Poisson statistics.
As shown in this figure from Rui Gardner, head of Flow Cytometry at Memorial Sloan Kettering, having 1 cell every 4 drops gives you a reasonable probability for having no cells in the leading or lagging drop.
So, the simple calculation of drop drive frequency divided by 4 will give you the maximum event rate you should strive for on the sorter.
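Here is a small sketch of both ideas: the divide-by-4 rule and the Poisson probabilities behind it. The 90 kHz drop drive frequency is a made-up value, and the function names are illustrative.

```python
from math import exp, factorial


def max_event_rate(drop_drive_frequency_hz, drops_per_cell=4):
    """Target event rate (events/second): one cell every `drops_per_cell` drops."""
    return drop_drive_frequency_hz / drops_per_cell


def poisson_probability(k, cells_per_drop):
    """Probability that a single drop contains exactly k cells."""
    return cells_per_drop ** k * exp(-cells_per_drop) / factorial(k)


rate = max_event_rate(90_000)  # hypothetical 90 kHz drop drive frequency
cells_per_drop = 1 / 4  # one cell every four drops

p_empty = poisson_probability(0, cells_per_drop)
p_neighbors_empty = p_empty ** 2  # leading and lagging drops both empty

print(f"Max event rate: {rate:.0f} events/s")
print(f"P(a drop is empty): {p_empty:.2f}")
print(f"P(leading and lagging drops both empty): {p_neighbors_empty:.2f}")
```

Rerunning this with a different `drops_per_cell` shows the trade-off directly: pushing the event rate higher makes it less likely that the neighboring drops are empty.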
How many cells do I need to start with? How long will it take?
Starting with the required number of cells for the downstream application, it is possible to approximate how many cells to start with so that you will end up with enough cells in the end. At the same time, we can estimate how long the run should take, barring unforeseen circumstances.
- Total cells needed / (frequency of population * sort efficiency) = starting population. I like to double the starting population to account for losses during processing.
- Starting population / (max events per second) = time of sort (sec)
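These two formulas are easy to wrap in a short script for sort planning. The sketch below assumes the frequency and the sort efficiency are expressed as fractions between 0 and 1, and it builds in the doubling as a `safety_factor`. All names and numbers are hypothetical.

```python
def starting_cells(cells_needed, population_frequency, sort_efficiency,
                   safety_factor=2.0):
    """Cells to start with, padded by `safety_factor` for processing losses."""
    if not 0 < population_frequency <= 1:
        raise ValueError("population_frequency must be a fraction in (0, 1]")
    if not 0 < sort_efficiency <= 1:
        raise ValueError("sort_efficiency must be a fraction in (0, 1]")
    return safety_factor * cells_needed / (population_frequency * sort_efficiency)


def sort_time_seconds(cells_to_run, max_events_per_second):
    """Time needed to run the sample at the given event rate."""
    return cells_to_run / max_events_per_second


# Hypothetical sort: 100,000 cells needed from a 1% population at 80%
# efficiency, run at 22,500 events/s
start = starting_cells(100_000, 0.01, 0.80)
seconds = sort_time_seconds(start, 22_500)
print(f"Start with about {start:,.0f} cells")
print(f"Expect roughly {seconds / 60:.0f} minutes of sorting")
```

The time estimate here assumes you run the entire doubled sample. If you stop as soon as you have enough cells, the run will be shorter.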
How pure is my sample? What is my post-sort recovery?
Here are three common values that can be used to characterize a sort.
Armed with this information, it is possible for you to figure out how long your sort will take, so you can plan accordingly. You can also observe how good the sort was and if you have enough cells for your downstream application.
5. Compensation
No post about flow cytometry calculations could be complete without touching on the most fundamental calculation in flow cytometry: the compensation matrix.
The comp matrix is essential for good flow cytometry, so that the spectral overlap from a given fluorochrome into a secondary channel is properly accounted for to ensure that it is possible to identify true signal.
As a reminder, the 3 rules of compensation are:
- The compensation sample should be at least as bright as the experimental samples to which the compensation will be applied.
- The backgrounds of the negative and carrier must be matched (no universal negative; cells-to-cells, beads-to-beads).
- The compensation color must be matched to the experimental color:
  - Same fluorochrome (FITC ≠ A488; tandems must be from the exact same stock).
  - Same sensitivity (don’t change voltage between tubes).
And, as always, collect enough events. Following these rules will ensure you have consistent and correct compensation.
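Your software handles this for you, but a two-color toy example can make the idea of accounting for spectral overlap less abstract. The sketch below assumes a simple linear model: each detector sees its own fluorochrome plus a fixed fraction of the other one. The function names, the detector labels A and B, and the intensities are all invented, and real software solves the same problem for many colors at once.

```python
def spillover_coefficient(pos_primary, neg_primary, pos_secondary, neg_secondary):
    """Fraction of a fluorochrome's signal that appears in a secondary
    detector, estimated from a single-stained control (positive vs negative)."""
    return (pos_secondary - neg_secondary) / (pos_primary - neg_primary)


def compensate_two_color(observed, spill_a_into_b, spill_b_into_a):
    """Recover the true signals from observed (detector A, detector B) values.

    Assumed model: obs_a = true_a + spill_b_into_a * true_b
                   obs_b = true_b + spill_a_into_b * true_a
    """
    det = 1.0 - spill_a_into_b * spill_b_into_a
    if det == 0:
        raise ValueError("spillover values make the system unsolvable")
    obs_a, obs_b = observed
    true_a = (obs_a - spill_b_into_a * obs_b) / det
    true_b = (obs_b - spill_a_into_b * obs_a) / det
    return true_a, true_b


# Hypothetical single-stain controls (median intensities)
a_into_b = spillover_coefficient(10_000, 100, 2_080, 100)
b_into_a = spillover_coefficient(8_000, 100, 495, 100)

# A hypothetical double-positive event as seen by the two detectors
true_a, true_b = compensate_two_color((1025.0, 700.0), a_into_b, b_into_a)
print(f"Spillover A into B: {a_into_b:.2f}, B into A: {b_into_a:.2f}")
print(f"Compensated signals: A = {true_a:.0f}, B = {true_b:.0f}")
```

The spillover coefficient is a ratio measured on the single-stained control. That ratio is what gets applied to every experimental event, so the control has to be measured under exactly the same conditions as the samples.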
If you have not yet integrated these calculations into your workflow, consider where each would be useful. Some, like the SI, are very useful in the development of new panels. From titration to voltration, the SI makes it easy to compare different samples. While it is possible to plot the data and try to gauge by eye, having a number makes the decision much easier. When preparing for a sort, it is vital to do these calculations, even if only for a ballpark estimate of how many cells you might need to start with.
Compensation is, of course, one of the most critical calculations, so make sure you provide the correct controls that meet the “3 Rules”, and let the software do the work. In the end, doing these calculations should help you with your work, as it will improve consistency and reproducibility, and ensure you have sufficient cells for your downstream applications.