The Difference Between Linear And Log Displays In Flow Cytometry

Data display is fundamental to flow cytometry and strongly influences the way that we interpret the underlying information.

One of the most important aspects of graphing flow cytometry data is the scale type. Flow cytometry data scales come in two flavors, linear and logarithmic (log), which dictate how data is organized on plots. Understanding these two scales is critical for data interpretation.

Let’s start at the beginning, where signal is generated, and trace its path all the way from the detector to the display.

Behind every flow cytometry data point is what we call a pulse. The pulse is the signal output of a detector generated as a particle transits the laser beam over time. As the cell passes through the laser beam, the intensity of the signal from the detector increases, reaches a maximum, and finally returns to baseline as the cell departs the laser beam. The entirety of this signal event is the pulse (see Figure 1).

Figure 1: The voltage pulse begins when a cell enters the laser, hits its maximum when the cell is maximally illuminated, then returns to baseline as the cell exits the beam.

This is all good, but an electrical pulse is not useful to us in and of itself. We need to extract some kind of information from it in order to measure the biological characteristics we are seeking. This is where the cytometer’s electronics (which contribute significantly to a particular cytometer model’s performance and to its price tag) come into play.

Modern instruments employ digital electronics. This means that the signal intensity over the course of a pulse is digitized by an analog-to-digital converter (ADC) before information is extracted from it.

This was not the case in the past, when most systems used analog electronics. In analog systems, the information about a pulse is calculated within the circuitry itself, and is digitized for the sole purpose of sending the data to the computer for display.

Regardless of the instrument, the type of data provided about the pulse is the same: area, height, and width (see Figure 2). These three pulse parameters are what are ultimately displayed on plots.

Figure 2: Three characteristics of the voltage pulse: area, height, and width.

Area and height are used as measurements of signal intensity, while width is often used to distinguish a single cell from two cells that passed through the laser so close together that the cytometer classified them as one event (a doublet event).
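To make the three pulse parameters concrete, here is a minimal Python sketch of how area, height, and width could be derived from a digitized pulse. This is purely illustrative: real cytometers compute these in dedicated electronics with baseline restoration and hardware thresholds, and the function name and threshold logic here are my own assumptions.

```python
def pulse_stats(samples, threshold=0.0):
    """Illustrative area, height, and width of a digitized pulse.

    samples: signal values sampled at a fixed rate as a cell transits the laser.
    threshold: baseline level above which the pulse is considered "on".
    (Hypothetical helper; actual cytometer firmware is far more sophisticated.)
    """
    above = [i for i, v in enumerate(samples) if v > threshold]
    height = max(samples)                             # peak intensity
    area = sum(v for v in samples if v > threshold)   # summed intensity over the pulse
    width = above[-1] - above[0] + 1 if above else 0  # samples spent above threshold
    return area, height, width
```

For the symmetric pulse `[0, 1, 3, 5, 3, 1, 0]`, this returns an area of 13, a height of 5, and a width of 5 samples.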

Typically, on flow cytometry plots, you will see the axis or scale labeled with an A, H, or W denoting the pulse parameter being displayed (e.g. “FITC-A,” “FITC-H,” or “FITC-W”).

It is important to note that all of the pulse processing is performed in the cytometer electronics system, not in the computer.

The reason for this is that the required speed for processing can exceed what is possible with the computer and its ethernet connection. Given this, the cytometer passes all of the pulse measurements, already neatly processed and packaged, to the computer and cytometer software that graphs the data.

This is when plot scaling becomes important.

The range of signal levels that the cytometer transmits to the computer is extremely large, and is a function of the cytometer’s ADC. The number of bits of the ADC determines how many values comprise this range of signals.

For example, a 24-bit ADC can divide the range of signals into 16,777,216 (2²⁴) discrete values. (Note that each scatter or fluorescence parameter gets its own ADC, so the number of ADCs equals the total number of parameters on the instrument.) Therefore, the dimmest FITC signal on this example instrument can be assigned a value of 1 while the brightest FITC signal can be assigned a value of 16,777,216.

Even though each signal is digitized into one of 2²⁴ possible values, this kind of resolution is much too fine to be useful on the scales of plots.

If a histogram’s scale reflected this many values, events would be spread out among so many channels that we would need to collect millions of events to see the peaks and populations we are used to.

Furthermore, computer monitors don’t have the resolution required to draw dots on this scale. Even if they did, the dots would be so small we wouldn’t be able to see them on the screen.

The universally employed solution is to scale down the resolution on plots to a more practical, but still useful, degree.

Instead of dividing the scale into millions of units, we divide it into 256 (or, in some cases, 512) units called channels.

For a 256-channel system, we allocate all 16,777,216 digital values equally among the channels, so that each one contains 65,536 discrete values (16,777,216 divided by 256). Channel 1 holds the 65,536 dimmest values, while channel 256 holds the 65,536 brightest.
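The binning arithmetic above can be sketched in a few lines of Python (the function name is mine; acquisition software performs this mapping internally):

```python
ADC_BITS = 24
N_VALUES = 2 ** ADC_BITS                      # 16,777,216 discrete digital values
N_CHANNELS = 256                              # display channels on a linear scale
VALUES_PER_CHANNEL = N_VALUES // N_CHANNELS   # 65,536 digital values per channel

def linear_channel(value):
    """Map a raw digital value (0 .. N_VALUES - 1) to a display channel (1 .. 256)."""
    return value // VALUES_PER_CHANNEL + 1
```

With this mapping, digital values 0 through 65,535 land in channel 1, and the very brightest value lands in channel 256.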

This kind of scale is linear because equivalent steps in spatial distance on the scale represent linear changes in the data. As illustrated in Figure 3, moving a distance of x reflects a change of 64 channels, regardless of whether the starting point is channel 0, channel 64, or channel 192.

As such, the key feature of a linear scale is that the channels are distributed equally along the scale: the distance between channel 1 and channel 2 is the same as the distance between channel 100 and channel 101.

Figure 3: On a linear scale, channels are spaced equally.

Linear scale is certainly nice, but what happens if two populations, with very different levels of intensity, must be plotted together? This is a common situation in flow cytometry, in which nonfluorescent cells are visualized on the same plot as brightly fluorescent cells.

In this case, a plot with linear scaling becomes much less useful, as it will be very difficult to see both fluorescent and nonfluorescent cells at the same time, no matter what PMT voltage we use. Either all the nonfluorescent cells will be crammed into the first few channels, or all the fluorescent cells will be crammed into the top few channels.

This is where a logarithmic scale comes into play.

A log scale is one in which steps in spatial distance on the scale represent changes in powers of 10 (usually) in the data.

In other words, moving up a log scale by one quarter of the scale allows us to move from channel 1 to channel 10 (see Figure 4). Moving another quarter distance up the scale brings us not to channel 20 but to channel 100, a power of 10.

Figure 4: On a log scale, channels are unequally spaced so that one can visualize both high and low signals on the same plot.

Log scales are really good at facilitating visualization of data with very different medians, and are organized into decades. A four-decade log scale is marked 10¹, 10², 10³, 10⁴, so it contains 10,000 channels in total.

Importantly, even though each channel itself contains the same number of digital values, data channels are not distributed equivalently across the scale.

The first decade, from 10⁰ to 10¹, contains 10 channels (channel 1 to channel 10). The second decade, even though it occupies the same amount of space on the scale, contains not 10 but 90 channels (11 to 100). And the fourth decade, from 10³ to 10⁴, occupying the same space as each other decade does, contains a whopping 9,000 channels (1,001 to 10,000).
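The decade arithmetic above can be sketched in Python; `log_position` (a name invented here for illustration) returns the fraction of the axis at which a given channel is drawn on a four-decade display:

```python
import math

DECADES = 4  # a four-decade display spans channels 1 to 10,000 (10^0 to 10^4)

def log_position(channel):
    """Fraction of the axis (0.0 .. 1.0) at which a channel sits on a log scale."""
    return math.log10(channel) / DECADES

# Each decade occupies the same quarter of the axis,
# even though it holds ten times as many channels as the decade before it.
```

Channel 1 sits at the origin, channel 10 one quarter of the way along, channel 100 at the halfway mark, and channel 10,000 at the far right.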

On the log scale, data is compressed to a much greater degree at the high end than it is at the low end, and it is this very property that makes it so good for visually representing data with very different medians (see Figure 5).

Figure 5: Effects of Linear vs Log scaling on resolution of 8-peak beads. The Spherotech 8-peak bead-set was run on a DIVA instrument with either Log scaling (left) or Linear scaling (right). The 8th peak was placed, on scale, at the far right of the plot. As can be seen, without log scaling of the data, the bottom 6 peaks cannot be resolved.

It is very important to keep in mind that in the digital cytometry world, these scales are solely visualization methods and, like a compensation matrix, have no effect on the underlying data. The scales are applied by the cytometry software, not the cytometry hardware.

Incidentally, this was not the case in older analog systems which applied the logarithmic transformation in the cytometer electronics using logarithmic amplifiers, so the data streamed to the computer was already “log transformed” before it got to the software.

At this point, you are probably wondering about the practicalities of these scales: when should you use linear scale and when should you use log scale?

Typically, linear scale is used for light scatter measurements (where particles differ subtly in signal intensity) and log scale is used for fluorescence (where particles differ quite starkly in signal).

However, it is not always this simple.

For most flow cytometry on mammalian cells, the range of both forward and side scatter signals generated by all particles in a single sample is not wide enough to warrant a log scale for proper visualization.

Particle size may range from a few microns to 20+ microns in a typical sample, so the entire gamut of particles would be happily on-scale using a linear scale. In fact, log scale would be counterproductive in this situation, compressing the range and making it difficult to differentiate different blood cell populations from each other, for example.

However, side scatter on a log scale can be extremely informative, especially when measuring “messy” samples with many different kinds of cell types, like those generated from dissociated solid tissues.

Additionally, make sure to use both forward and side scatter on log scale when measuring microparticles or microbiological samples like bacteria. These types of particles generate dim scatter signals that are close to the cytometer’s noise, so it’s often necessary to visualize signal on a log scale in order to separate the signal from scatter noise.

Fluorescence measurements typically involve populations that differ significantly in intensity, and thus require a log scale for visualization. This is the case when measuring signal from immunofluorescence, fluorescent proteins, viability dyes, or most functional dyes.

However, there is a major exception: cell cycle analysis. Cell cycle analysis by flow cytometry is usually accomplished by measuring DNA content via fluorescence. Cells in G2/M contain up to twice the amount of DNA found in other cells, so we need to see relatively small differences in signal intensity in order to assess cell cycle state.

Therefore, cell cycle analysis must be visualized on linear scale.

We hope this explanation sheds some light on scaling. Knowing how to properly display your data is a critical part of scientific communication. Remember to use linear scaling for most scatter parameters, or when you need to visualize small changes, and log scaling for most fluorescence parameters, or when you need to visualize a wide range of values. As always in flow cytometry, there are certainly exceptions, but armed with this knowledge, you should be able to make educated judgements about which scale types to use in various assays and to better interpret your data. Happy flowing!

To learn more about The Difference Between Linear And Log Displays In Flow Cytometry, and to get access to all of our advanced materials including 20 training videos, presentations, workbooks, and private group membership, get on the Flow Cytometry Mastery Class wait list.


ABOUT TIM BUSHNELL, PHD

Tim Bushnell holds a PhD in Biology from Rensselaer Polytechnic Institute. He is a co-founder of, and the didactic mind behind, ExCyte, the world’s leading flow cytometry training company, which boasts a veritable library of in-the-lab resources on sequencing, microscopy, and related topics in the life sciences.

