
Five Important Tips For Analyzing Your Data

Written by Tim Bushnell, PhD

Depending on the experimental design, many researchers will be running complex assays that require statistical analysis to determine whether the results support or contradict the hypothesis being tested. Unfortunately, many researchers go about this analysis the wrong way, resulting in spurious conclusions. The following points are guides to help think about the steps necessary in flow cytometry data analysis.

1. Before you start

Define your hypothesis. This may sound simplistic, but understanding the purpose of the experiments is the first step in performing good statistical analysis. Stating the hypothesis allows the researcher to choose the correct statistical test BEFORE the experiments are begun and, more importantly, to define the null hypothesis. The null hypothesis is what is assumed to be true until evidence shows otherwise.

For example, if the experimental question is “does treatment of patients with drug X increase the number of mature B cells in circulation?”, then the null hypothesis would be that “treatment of patients with drug X causes no change to the number of mature B cells in circulation.” With the null hypothesis in place, the type of test to be performed becomes clear. In this case, a t-test would be the logical choice for testing the null hypothesis.
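To make the drug X example concrete, here is a minimal sketch of a two-group comparison using only the Python standard library. It computes Welch's variant of the t statistic (which does not assume equal variances) and the approximate degrees of freedom. The function name `welch_t` and the percentages are hypothetical, made up purely for illustration; turning the statistic into a P-value requires a t distribution, which a statistics package can supply.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical data: % mature B cells in circulation, one value per patient.
control = [8.1, 7.4, 9.0, 8.6, 7.9, 8.3]
treated = [9.8, 10.4, 9.1, 10.9, 9.6, 10.2]

t_stat, dof = welch_t(treated, control)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof:.1f}")
```

A large absolute t value relative to the degrees of freedom is evidence against the null hypothesis of "no change"; a t value near zero is what the null hypothesis predicts.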

2. Set your threshold.

The threshold value (often called alpha) is the probability, accepted in advance, of declaring a result significant when the null hypothesis is actually true, that is, when the apparent effect arose by chance alone. It is conventional in science to set this at 0.05, meaning a 5% chance of a false positive when there is no real effect. There is no magic to 0.05; it is an accepted convenience first proposed by R.A. Fisher. Many scientists are moving to a threshold of 0.01 or even 0.001, which lowers the chance that a significant result is an accident.
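What the threshold means in practice can be illustrated with a small simulation. The sketch below repeatedly draws two groups from the same population, so the null hypothesis is true by construction, and counts how often the P-value falls below the threshold. For simplicity it uses a z-test with a known standard deviation rather than a t-test; the function names, group size, and seed are assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist, mean

ALPHA = 0.05  # threshold chosen before any data are examined

def z_test_p_value(group_a, group_b, sigma):
    """Two-sided P-value for a difference in means, assuming known sigma."""
    n_a, n_b = len(group_a), len(group_b)
    z = (mean(group_a) - mean(group_b)) / (sigma * math.sqrt(1 / n_a + 1 / n_b))
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(n_experiments=2000, n_per_group=10, seed=1):
    """Fraction of null experiments that cross the threshold by chance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # Both groups come from the same distribution: no real effect exists.
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if z_test_p_value(a, b, sigma=1.0) < ALPHA:
            hits += 1
    return hits / n_experiments

rate = false_positive_rate()
print(f"Fraction of null experiments called significant: {rate:.3f}")
```

The printed fraction should land close to the chosen threshold. Lowering `ALPHA` to 0.01 lowers the false positive rate accordingly.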

3. Know the numbers.

All populations can be described by two numbers: the central tendency and the spread of the data. Depending on the type of data being looked at, different measures should be used. In the case of expression data, the fluorescence intensity is best represented by the median value. The median represents the midpoint of the data and is robust: one does not need to know the complete dataset, it does not assume a normal distribution, and it is resistant to outliers. In the case of the question above, the change in the percentage of a sub-population of cells, the mean value is a better choice, assuming the data follow a normal distribution.

The spread of the data is measured using the standard deviation (SD). The smaller the SD, the more tightly the data are clustered around the mean. In the case of the median, the robust SD (rSD) is one of several measures used to describe the deviation around the median. Another is the median absolute deviation (MAD), which is the median of the absolute values of the deviations from the median.

Confused? Don’t worry about how each statistic is calculated; the software can do that for you. It is easy to remember Mean/SD and Median/rSD as a simple way to know which values should be used together.
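For readers who do want to see the arithmetic, the sketch below computes both pairings on a small set of hypothetical fluorescence intensities that includes one extreme outlier. The rSD here is taken as 1.4826 × MAD, one common convention that matches the SD for normally distributed data; this is an assumption, and analysis packages may define rSD differently (for example, from percentiles).

```python
import statistics

def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])

# Hypothetical fluorescence intensities with one extreme outlier.
intensities = [120, 135, 128, 131, 125, 129, 4000]

mean_value = statistics.mean(intensities)
sd_value = statistics.stdev(intensities)
median_value = statistics.median(intensities)
mad_value = median_absolute_deviation(intensities)
rsd_value = 1.4826 * mad_value  # assumed convention; software may differ

print(f"Mean / SD:    {mean_value:.1f} / {sd_value:.1f}")
print(f"Median / rSD: {median_value} / {rsd_value:.2f} (MAD = {mad_value})")
```

The single outlier drags the mean and SD far from the bulk of the data, while the median and MAD barely move, which is why the median-based pair is preferred for intensity data.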

4. Can’t forget controls.

Controls in flow are essential at many levels. These range from the FMO control (for gating), to the unstimulated control (for setting background in stimulation experiments), to the reference control (used to ensure the experiment is reproducible and to identify the biological variation in the experiment). Additionally, make sure that the correct control is being used for statistical analysis. The FMO control, for example, is not the control to use to identify the negatives in a statistical analysis. That role should be played by fully gated known negatives or background (unstimulated) cells.

5. Make sure you perform enough replicates.

This YouTube video has made the rounds (http://www.youtube.com/watch?v=PbODigCZqL8) and illustrates something that one needs to be careful of. Don’t use just 3 patients! To ensure enough replicates are performed, consider performing an a priori power analysis. In this analysis, an estimate of the difference between the Treatment and Control groups is made, and the sample size needed to detect that difference is determined. The power of a statistical test is important in reducing Type II errors (false negatives). To improve the statistical power of a test, consider adding more samples. The larger the sample size, the more power the statistical test will have.
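As a rough illustration of an a priori power analysis, the sketch below estimates the number of samples needed per group for a two-sided, two-group comparison of means. It uses the normal approximation, which slightly underestimates the size a t-test needs for small groups, so treat the result as a lower bound. The function name `samples_per_group` and the default power of 0.80 are assumptions; the effect size is the expected difference between Treatment and Control divided by the expected SD.

```python
import math
from statistics import NormalDist

def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    effect_size: expected difference between groups divided by the expected SD.
    Uses the normal approximation n = 2 * ((z_alpha + z_power) / effect_size)^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the threshold
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

for d in (2.0, 1.0, 0.5):
    print(f"effect size {d}: about {samples_per_group(d)} samples per group")
```

Halving the expected effect size roughly quadruples the number of samples required, and tightening the threshold to 0.01 raises it further, which is why the threshold and the expected difference both need to be settled before the experiment starts.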

Statistical analysis is a powerful tool in flow cytometry and should be considered as part of the initial experimental design, rather than at the end when the data are completely collected. Identifying the proper null hypothesis will lead to identifying the correct statistical test. Setting the proper threshold in advance, rather than running the test and seeing what P-value is returned, is an essential way to ensure the significance of the data is properly measured and understood. Finally, collect enough events and enough patient samples to ensure adequate power and minimize the chance of a false negative error.
