How To Analyze FACS Data And Prepare Flow Cytometry Figures For Scientific Papers

“It would be possible to describe everything scientifically, but it would make no sense; it would be without meaning, as if you described a Beethoven symphony as a variation of wave pressure.”

― Albert Einstein

Any scientific process, as you know, requires communicating the data that supports or refutes the hypothesis under test.

Before that data is deemed worthy of publication, it must survive peer review, where it is laid bare before a group of experts in the field who judge the material impartially (usually) and in secret, then pass judgment on its suitability for publication.

The presentation of your data must be clear.

As such, choosing the right flow figures to communicate your data is essential.

Good handwriting (formerly known as proper penmanship) and drawings might have been enough to convince peers in the distant past, but not today.

Today, the expectation is that you’ll choose the right flow figures from all that are available, selecting the ones that reflect your data accurately and without confusion.

There is so much data and so little time that it is essential to present information in the clearest, most concise way.

As Einstein once said: “Everything must be made as simple as possible. But not simpler.”

Presenting the data in the best possible format, highlighting your results while avoiding glitz that can make the integrity of your data suspicious, is key.

At first glance, flow cytometry data is very visual.

Analysis techniques rely on univariate plots (a.k.a. histograms), bivariate plots (a.k.a. dot plots) and even higher-order plots (3D plots, SPADE trees, etc.).

The huge caveat with falling in love with any of these plot types is that the plots used for flow analysis are, more often than not, a means to an end.

Their purpose is to extract numeric values (such as percent positive or median fluorescence intensity) from the data. Those numbers are the real value of the data to be presented.
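To make the "numbers behind the plot" idea concrete, here is a minimal sketch of pulling percent positive and median fluorescence intensity out of a list of event intensities. The data is synthetic, and the function name `summarize_gate` and the positivity threshold of 500 are assumptions made for illustration. In practice the threshold would come from your controls.

```python
import random
import statistics

def summarize_gate(intensities, threshold):
    """Return (percent_positive, mfi_of_positives) for one marker.

    `threshold` is a hypothetical positivity cutoff; in a real analysis it
    would be set from a control sample, not picked by hand.
    """
    positives = [x for x in intensities if x > threshold]
    percent_positive = 100.0 * len(positives) / len(intensities)
    # Median fluorescence intensity (MFI) of the positive events only.
    mfi = statistics.median(positives) if positives else float("nan")
    return percent_positive, mfi

# Synthetic example: 8,000 dim events and 2,000 bright events.
rng = random.Random(0)
events = [rng.lognormvariate(4.0, 0.4) for _ in range(8000)]   # distribution median ~55
events += [rng.lognormvariate(8.0, 0.4) for _ in range(2000)]  # distribution median ~2981

pct, mfi = summarize_gate(events, threshold=500.0)
print(f"{pct:.1f}% positive, MFI = {mfi:.0f}")
```

Whatever plot you choose, these two values are the kind of numbers that end up in the statistics and in the text of the paper.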

Here are the benefits and drawbacks of popular flow figures to consider when presenting your data:

1. Histograms.

Histograms tend to be the most abused of figures for presenting flow cytometry data.

These plots show the intensity of expression versus the number of events.

Typically, data from different conditions are shown on one graph, often with an offset, as below…

Figure_#1_Flow_Figure

Histograms are useful for cell cycle and proliferation analysis, but are less useful for presenting data for several reasons:

  • No relationship between different markers (can’t identify double positive cells)
  • Subtle populations lost in larger distribution (no rare events)
  • Shape is dependent on binning (different for different instruments and analysis tools)
  • Peak height is a function of the number of events and spread of the data
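The last two points are easy to check for yourself. The sketch below bins the same synthetic population in different ways and compares peak heights. The helper `histogram_counts`, the 0 to 1024 axis range and the bin counts are assumptions chosen for illustration, not the settings of any particular instrument or analysis package.

```python
import random

def histogram_counts(values, n_bins, lo, hi):
    """Count events in n_bins equal-width bins spanning [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            # min() guards against a rounding edge case at the upper limit.
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts

rng = random.Random(1)
narrow = [rng.gauss(500, 20) for _ in range(10000)]  # tight population
wide = [rng.gauss(500, 60) for _ in range(10000)]    # same centre, more spread

peak_64 = max(histogram_counts(narrow, 64, 0, 1024))
peak_256 = max(histogram_counts(narrow, 256, 0, 1024))
peak_wide = max(histogram_counts(wide, 64, 0, 1024))
peak_half = max(histogram_counts(narrow[:5000], 64, 0, 1024))

print("same data, 64 bins vs 256 bins:", peak_64, peak_256)
print("same event count, wider spread:", peak_wide)
print("same spread, half the events:  ", peak_half)
```

The population never changes its biology, yet the peak height moves with the binning, the spread and the event count. That is why peak height on its own is a poor thing to compare between samples.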

2. Scatter Graphs.

The data that really matter are the numbers extracted from the flow plots. As such, scatter graphs should be seen as a way to summarize the real data.

The power of the scatter graph is that it shows several things at once:

  • The number of the experiments that were performed in generating the data
  • The average of the data
  • The spread of the data
  • The significance of the data

Figure_#2_Flow_Figure
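The four items a scatter graph conveys can all be computed directly from the extracted numbers. Here is a minimal sketch using made-up percent-positive values, one per independent experiment. The exact permutation test is simply one reasonable choice for small groups and is an assumption of this example. It is not a recommendation over whatever test suits your experimental design.

```python
import itertools
import statistics

def summarize(values):
    """Return n, mean, and sample standard deviation for one group."""
    return len(values), statistics.mean(values), statistics.stdev(values)

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    total = sum(pooled)
    extreme = 0
    splits = 0
    # Enumerate every way of relabelling the pooled values into two groups.
    for idx in itertools.combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-9:
            extreme += 1
        splits += 1
    return extreme / splits

# Hypothetical percent-positive values from five experiments per condition.
control = [12.1, 10.8, 13.4, 11.9, 12.6]
treated = [18.3, 20.1, 17.5, 19.4, 18.8]

for name, group in (("control", control), ("treated", treated)):
    n, mean, sd = summarize(group)
    print(f"{name}: n={n}, mean={mean:.2f}, SD={sd:.2f}")

p = permutation_p_value(control, treated)
print(f"permutation p-value = {p:.4f}")
```

With two fully separated groups of five, only 2 of the 252 possible splits are as extreme as the observed one, so the p-value here is 2/252, or about 0.008.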

3. Bivariate plots.

Bivariate plots have some utility in presenting the manner in which the populations of interest were identified.

Bivariate plots show the relationship between two different markers, allowing for more complex phenotypes to be identified and important populations of interest to be isolated via gating.

The original bivariate plot was the ‘dot plot’, a figure that showed the relationship between two variables, but lacked detail about the number of events in a given region.

4. Density Plots.

The dot plot led to the development of the ‘density’ plot ― a way to show not just expression levels, but the relative number (i.e. density) of events in a given region.

Three such density plots are shown below (generated in FlowJo v9)…

Figure_#3_Flow_Figure

Each of these plots shows the same thing, just in slightly different ways, so pick the one you are most comfortable with and use it.
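Underneath, every style of density plot rests on the same idea: divide the plot into small regions and count the events in each. A minimal sketch of that counting step follows, using synthetic data. The function `bin_events` and the bin width of 32 are assumptions for illustration, and this is not a description of how FlowJo or any other package does it internally.

```python
import random
from collections import Counter

def bin_events(xs, ys, bin_width):
    """Count events per square bin; keys are (column, row) bin indices."""
    counts = Counter()
    for x, y in zip(xs, ys):
        counts[(int(x // bin_width), int(y // bin_width))] += 1
    return counts

# Two synthetic populations: a tight, abundant one and a diffuse, sparse one.
rng = random.Random(2)
xs = [rng.gauss(200, 15) for _ in range(5000)] + [rng.gauss(700, 60) for _ in range(500)]
ys = [rng.gauss(200, 15) for _ in range(5000)] + [rng.gauss(700, 60) for _ in range(500)]

counts = bin_events(xs, ys, bin_width=32)
densest = max(counts, key=counts.get)
# Bins in columns 12 and above belong to the sparse population in this example.
sparse_peak = max(c for (col, _row), c in counts.items() if col >= 12)

print("densest bin:", densest, "with", counts[densest], "events")
print("busiest bin of the sparse population:", sparse_peak, "events")
```

A plain dot plot draws both populations as clouds of dots. The per-bin counts are what let a density plot show that one cloud holds far more events than the other.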

5. Contour Plots.

The other way to show the density of your data is to use a contour plot. Like the above density plots, these show the relative density of events, here using contour lines. In this case, the lines are drawn at steps of x% of the events (as defined by the plot).

In the plot below, the lines are at 5% of the population, so the outermost line contains 95% of the cells, the second line 90% and so on.

The closer the lines are together, the steeper the ‘island’ of cells. Unfortunately, contour plots are not good at showing the outliers. The best strategy here is to couple a contour plot with a dot plot, allowing your rare events to be displayed (shown below in the plot on the right).

Figure_#4_Flow_Figure
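To see where lines at 5% steps come from, and why outliers vanish, here is a rough sketch that approximates contour levels from binned counts. It ranks bins from densest to sparsest and finds the bin count at which the running total first reaches 95%, 90% and so on. This binned approximation, the helper names and the synthetic data are all assumptions of the example. Analysis packages use their own smoothing and contouring methods.

```python
import random
from collections import Counter

def bin_events(xs, ys, bin_width):
    """Count events per square bin; keys are (column, row) bin indices."""
    counts = Counter()
    for x, y in zip(xs, ys):
        counts[(int(x // bin_width), int(y // bin_width))] += 1
    return counts

def contour_thresholds(bin_counts, fractions):
    """For each fraction, return the bin count at which the densest bins
    together first hold at least that fraction of all events."""
    total = sum(bin_counts.values())
    ordered = sorted(bin_counts.values(), reverse=True)
    thresholds = {}
    for frac in fractions:
        running = 0
        for c in ordered:
            running += c
            if running >= frac * total:
                thresholds[frac] = c
                break
    return thresholds

rng = random.Random(3)
xs = [rng.gauss(500, 50) for _ in range(20000)]
ys = [rng.gauss(500, 50) for _ in range(20000)]
counts = bin_events(xs, ys, bin_width=16)

# Lines every 5%: the outermost encloses 95%, the next 90%, and so on.
fractions = [k / 100 for k in range(95, 0, -5)]
levels = contour_thresholds(counts, fractions)

outer = levels[fractions[0]]
# Events in bins sparser than the outermost level lie outside every line.
outliers = sum(c for c in counts.values() if c < outer)
print(f"outermost contour level: {outer} events per bin")
print(f"{outliers} of {len(xs)} events fall outside it")
```

The events counted as `outliers` are exactly the ones a contour-only plot never draws, which is the argument for overlaying the dots.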

One concern reviewers may have about the contour plot, and one that can prevent your data from being published, is that it does not convey a sense of the number of events on the plot. This is a common criticism of all bivariate plots.

As shown in this figure, only a few points make a very compelling plot (or seemingly compelling plot)…

Figure_#5_Flow_Figure

The solution to this problem is to indicate the number of events on a given plot. This will give reviewers and all readers an indication of the magnitude of the data involved in the analysis.
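One simple way to do this is to put the counts into the gate annotation itself. The sketch below is only an illustration: the function `gate_label`, the label format and the marker name "CD4+" are all hypothetical, and your analysis software will have its own way of adding statistics to a plot.

```python
def gate_label(name, n_in_gate, n_parent):
    """Build a gate annotation that reports both the percentage and the counts."""
    pct = 100.0 * n_in_gate / n_parent
    return f"{name}: {pct:.1f}% ({n_in_gate:,} of {n_parent:,} events)"

# Two plots with an identical percentage but very different amounts of data.
print(gate_label("CD4+", 37, 50))
print(gate_label("CD4+", 7400, 10000))
```

Both labels read 74.0%, but only the second is backed by thousands of events, and the reader can see that at a glance.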

6. Gating Strategy (All Plots).

The gating strategy used is of great interest to the reader of a paper or grant. It is also a common criticism of flow cytometry data in general. Why? Because…

Gating is a subjective art form.  

At least, gating can be a subjective art form. In a Nature Immunology paper, Maecker and other researchers performed a series of studies concluding that…

Figure_#6_Flow_Figure

In other words…

Since the conclusions from the study will be based on the populations of interest as defined by the gating strategy, the strategy must be applied consistently, and how it was established is a critical piece of information to share.

An excellent example of this can be seen in any of the published OMIPs, such as OMIP-3 by Wei et al. (see below).

Figure_#7_Flow_Figure

The above presentation of the gating strategy is valuable for dispelling the myth that gating is a subjective art form.

As new automated analytical techniques become more widespread, they will also help in addressing this issue while adding a level of confidence that the data extracted for downstream statistical analysis has come from a robust, vetted process.
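One way to think about a well-communicated gating strategy is as an ordered list of steps that anyone could re-apply, with the event count reported at every level. The sketch below uses simple one-dimensional range gates on a handful of made-up events. The parameter names, gate boundaries and the `apply_gates` helper are assumptions for illustration. Real gates are usually drawn as regions on bivariate plots.

```python
def apply_gates(events, gates):
    """Apply gates in order and report count and percent of parent per step.

    events: list of dicts mapping parameter name -> value.
    gates:  list of (gate_name, parameter, low, high) tuples; an event passes
            when low <= value < high.
    """
    report = []
    current = events
    for name, param, low, high in gates:
        kept = [e for e in current if low <= e[param] < high]
        pct_parent = 100.0 * len(kept) / len(current) if current else 0.0
        report.append((name, len(kept), pct_parent))
        current = kept
    return current, report

# A tiny, made-up data set; all parameter names and values are hypothetical.
events = [
    {"FSC-A": 50_000, "CD3": 900, "CD4": 1200},
    {"FSC-A": 60_000, "CD3": 850, "CD4": 40},
    {"FSC-A": 55_000, "CD3": 30, "CD4": 25},
    {"FSC-A": 5_000, "CD3": 10, "CD4": 15},    # low scatter, excluded as debris
    {"FSC-A": 58_000, "CD3": 1100, "CD4": 1500},
]
gates = [
    ("Cells", "FSC-A", 20_000, 250_000),
    ("CD3+", "CD3", 500, float("inf")),
    ("CD4+", "CD4", 500, float("inf")),
]

final, report = apply_gates(events, gates)
for name, count, pct in report:
    print(f"{name}: {count} events ({pct:.1f}% of parent)")
```

Writing the strategy down this way forces the order of gates, the boundaries and the parent population for each percentage to be stated explicitly, which is the same information a gating figure should give the reader.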

When preparing figures for publication, the scientific question and hypothesis that form the basis of the paper must be central, and all the figures must support them. The flow cytometry data that forms the basis of the conclusions should be presented clearly and concisely. While flow cytometry provides pretty pictures and colorful layouts, the meat of the data is the numbers: percentages of populations, fluorescence intensity levels and the like. These are what will convince the reader that the hypothesis tested is valid and well thought-out.

To learn more about getting your flow cytometry data published and to get access to all of our advanced materials including 20 training videos, presentations, workbooks, and private group membership, get on the Flow Cytometry Mastery Class wait list.


ABOUT TIM BUSHNELL, PHD

Tim Bushnell holds a PhD in Biology from the Rensselaer Polytechnic Institute. He is a co-founder of, and the didactic mind behind, ExCyte, the world’s leading flow cytometry training company. The organization boasts a veritable library of in-the-lab resources on sequencing, microscopy, and related topics in the life sciences.
