How To Profile DNA And RNA Expression Using Next Generation Sequencing (Part-2)

In the first blog of this series, we explored the power of sequencing the genome at various levels. We also dealt with how characterizing RNA expression levels helps us understand changes at the genome level, changes that affect the downstream expression of target genes. In this blog, we will explore how NGS can help us comprehend DNA modifications that affect the expression pattern of given genes (epigenetic profiling), and how it can characterize DNA-protein interactions, allowing the identification of genes that may be regulated by a given protein.

DNA Methylation Profiling or Epigenetic Profiling

NGS can be adapted to profile DNA methylation either through an enrichment (using methyl CpG antibody or methyl-CpG-binding protein) or by bisulfite sequencing.

Figure 1: Different methods of NGS DNA Methylation profiling.

1. Bisulfite Sequencing

Bisulfite treatment of DNA converts unmethylated cytosines to uracil, while methylated cytosines remain unchanged. After PCR amplification, the uracil bases are read as thymine in the sequencing data, so comparing reads against the reference can identify the location and percentage of methylated cytosines. NGS-based bisulfite sequencing, whether whole-genome or targeted, makes it possible to profile cytosine methylation across the genome at single-base resolution.
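
To make the conversion logic concrete, here is a minimal Python sketch. It is an illustration only: the function names, the toy reference, and the assumption that reads are already aligned to the top strand without gaps are ours, not part of any real bisulfite aligner.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate bisulfite treatment plus PCR on the top strand.

    Unmethylated C is read as T; a methylated C (0-based positions in
    `methylated_positions`) is protected and still read as C.
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def methylation_percent(reference, aligned_reads):
    """Per-cytosine methylation level from gap-free reads of reference length.

    Returns {position: percent of reads that show C rather than T}.
    """
    levels = {}
    for index, ref_base in enumerate(reference.upper()):
        if ref_base != "C":
            continue
        calls = [read[index] for read in aligned_reads if read[index] in "CT"]
        if calls:
            levels[index] = 100.0 * calls.count("C") / len(calls)
    return levels


reference = "ACGTCCGA"
# Hypothetical sample: position 1 is methylated in 3 of 4 molecules,
# positions 4 and 5 are never methylated.
reads = [bisulfite_convert(reference, {1}) for _ in range(3)]
reads.append(bisulfite_convert(reference, set()))

print(reads[0])                                # ACGTTTGA
print(methylation_percent(reference, reads))   # {1: 75.0, 4: 0.0, 5: 0.0}
```

Real pipelines also have to handle the reverse strand (where the conversion appears as G to A), incomplete conversion, and sequencing errors, all of which this sketch ignores.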

Types of bisulfite sequencing:

a. Whole-Genome Bisulfite Sequencing (WGBS)

Currently, WGBS is the most comprehensive way to profile DNA methylation at base-pair resolution. However, the required sequencing depth (minimum 30x) makes it cost-prohibitive for many studies. Thus, enrichment-based methods have been devised to reduce the cost of methylation profiling, especially when 100% coverage or base-pair resolution is not necessary.
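
A quick back-of-the-envelope calculation shows why depth drives the cost. The sketch below uses the standard relationship depth = (number of reads x read length) / genome size. The genome size of about 3.1 billion bp and the 150 bp read length are assumptions chosen for illustration.

```python
def reads_for_depth(genome_size_bp, target_depth, read_length_bp):
    """Number of reads needed so that reads * read_length / genome_size >= depth."""
    total_bases = genome_size_bp * target_depth
    # Integer ceiling division avoids floating-point rounding on large numbers.
    return -(-total_bases // read_length_bp)


human_genome_bp = 3_100_000_000   # assumed approximate human genome size
read_length_bp = 150              # assumed read length

reads_needed = reads_for_depth(human_genome_bp, 30, read_length_bp)
print(reads_needed)               # 620000000 reads, i.e. 93 Gb of sequence
```

This is a lower bound: duplicate reads, unmapped reads, and the reduced mappability of bisulfite-converted reads all push the practical requirement higher.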

b. Reduced Representation Bisulfite Sequencing (RRBS)

RRBS relies on restriction enzymes such as MspI (CCGG) or BglII (AGATCT), which tend to cut inside or near CpG islands and promoter regions regardless of methylation status. Subsequently, fragments of 40 to 220 bp are isolated and end-repaired, then treated with bisulfite and amplified by PCR. RRBS using MspI captures approximately 80% of CpG islands and 60% of promoter regions in the human genome.
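
The digestion and size-selection steps can be mimicked in silico. The following Python sketch is a simplified illustration: it assumes MspI cuts between the two cytosines of CCGG (C^CGG), works on a single strand, and uses a made-up toy sequence. The function names are hypothetical.

```python
import re


def mspi_fragments(sequence):
    """Cut a sequence at every MspI site (C^CGG) and return the fragments."""
    seq = sequence.upper()
    # The lookahead finds every CCGG; the cut falls after the first C.
    cut_points = [match.start() + 1 for match in re.finditer(r"(?=CCGG)", seq)]
    boundaries = [0] + cut_points + [len(seq)]
    return [seq[start:end] for start, end in zip(boundaries, boundaries[1:])]


def size_select(fragments, min_bp=40, max_bp=220):
    """Keep fragments in the size window described above (40 to 220 bp)."""
    return [fragment for fragment in fragments if min_bp <= len(fragment) <= max_bp]


# Toy sequence with three MspI sites; not a real genomic region.
toy = "A" * 10 + "CCGG" + "T" * 50 + "CCGG" + "A" * 300 + "CCGG" + "T" * 5

fragments = mspi_fragments(toy)
print([len(fragment) for fragment in fragments])               # [11, 54, 304, 8]
print([len(fragment) for fragment in size_select(fragments)])  # [54]
```

Note that the first and last fragments here end at the sequence boundary rather than at a second MspI site, so in a real digest only the internal fragments would be true MspI-MspI fragments.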

2. Methylated DNA-enriched Sequencing

a. MethylCap-Seq

This method uses the methyl-CpG-binding domain (MBD) of MeCP2 to capture methylated DNA on magnetic beads. After magnetic capture, the bound DNA is eluted with a high-salt solution and then used for NGS. While this is a cost-effective method, the current resolution is ~150 bp, so it is best suited to fast, large-scale, low-resolution studies.

b. Methylated DNA Immunoprecipitation-Seq (MeDIP-Seq)

MeDIP-Seq uses an anti-methylcytosine antibody to immunoprecipitate DNA containing methylated CpGs. While MeDIP-Seq can be relatively inexpensive, its resolution is limited to roughly 100 to 300 bp.

DNA-protein Interaction Profiling

Due to the quantitative nature of NGS, chromatin immunoprecipitation-enriched DNA can be sequenced with NGS to profile any genomic regions bound by the proteins of interest that can either be recognized with an antibody or tagged with an epitope. These include DNA-binding proteins, transcription factors, histones, histone variants, specific histone modifications, and nucleosomes.

1. ChIP-Seq (Chromatin Immunoprecipitation Sequencing)

To create a ChIP-enriched library, DNA-bound proteins are cross-linked to DNA using formaldehyde before the chromatin is fragmented. The sample is then enriched by immunoprecipitation with an antibody specific to the protein or protein modification of interest. Subsequently, the crosslinks are reversed, and the ChIP-enriched library can be assayed using quantitative PCR, microarray, or NGS.
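
Once the library has been sequenced, enrichment is typically judged against a control. The sketch below assumes that a non-immunoprecipitated input sample was sequenced alongside the ChIP sample, which the text above does not specify, and computes a simple per-bin fold enrichment. The bin size, pseudocount, and read positions are illustrative choices, and dedicated peak callers use far more rigorous statistics.

```python
from collections import Counter


def bin_counts(read_starts, bin_size):
    """Count reads per fixed-width bin, keyed by bin index."""
    return Counter(start // bin_size for start in read_starts)


def fold_enrichment(chip_starts, input_starts, bin_size=200, pseudocount=1.0):
    """Fold enrichment of ChIP over input per bin, keyed by bin start."""
    if not chip_starts or not input_starts:
        raise ValueError("both read lists must be non-empty")
    chip = bin_counts(chip_starts, bin_size)
    control = bin_counts(input_starts, bin_size)
    # Scale ChIP counts to the input library size before comparing.
    scale = len(input_starts) / len(chip_starts)
    enrichment = {}
    for bin_index in sorted(set(chip) | set(control)):
        numerator = chip[bin_index] * scale + pseudocount
        denominator = control[bin_index] + pseudocount
        enrichment[bin_index * bin_size] = numerator / denominator
    return enrichment


# Hypothetical read start positions on one chromosome.
chip_reads = [210, 220, 230, 250, 260, 270, 900, 1500]
input_reads = [100, 300, 500, 700, 900, 1100, 1300, 1500]

enrichment = fold_enrichment(chip_reads, input_reads)
best_bin = max(enrichment, key=enrichment.get)
print(best_bin, enrichment[best_bin])   # 200 3.5
```

The pseudocount keeps bins with no input reads from causing a division by zero, at the price of shrinking the ratios in low-coverage bins.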

Difference between ChIP-chip and ChIP-Seq

ChIP-chip resolution is limited by the size and spacing of the probes on the array, whereas ChIP-Seq can provide single-nucleotide resolution. ChIP-Seq requires much less input DNA, and its dynamic range is limited only by the sequencing depth. Additionally, ChIP-Seq makes it possible to profile repetitive regions, which are often omitted from microarrays. Repetitive regions that are often important for epigenetic control, such as heterochromatin or microsatellites, may only be mapped with NGS.

In addition to identifying genomic regions bound by the proteins, ChIP-Seq can provide insights into the functions of the DNA-bound proteins themselves. For example, ChIP-Seq data can be used to identify the cognate binding motifs of the DNA-binding proteins. The sequence data can also be used to globally infer distances between the binding sites and genomic features, such as transcription start sites, exon-intron boundaries, the 3' ends of genes, and other known binding sites.
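
As an illustration of inferring distances to genomic features, the Python sketch below finds the nearest transcription start site (TSS) for each binding site. It assumes all coordinates are on one chromosome, ignores gene strand, and uses made-up positions. The function name is hypothetical.

```python
from bisect import bisect_left


def nearest_tss_distance(peak_position, sorted_tss):
    """Signed distance from a binding site to its nearest TSS.

    `sorted_tss` must be a non-empty list sorted in ascending order.
    A negative value means the site lies at a lower coordinate than the TSS.
    """
    index = bisect_left(sorted_tss, peak_position)
    # Only the TSS just before and just after the site can be the nearest.
    candidates = sorted_tss[max(index - 1, 0): index + 1]
    nearest = min(candidates, key=lambda tss: abs(tss - peak_position))
    return peak_position - nearest


tss_positions = [1000, 5000, 12000]       # hypothetical TSS coordinates
peak_summits = [950, 5200, 9000, 20000]   # hypothetical binding sites

distances = [nearest_tss_distance(peak, tss_positions) for peak in peak_summits]
print(distances)   # [-50, 200, -3000, 8000]
```

A strand-aware version would flip the sign for genes on the reverse strand so that negative values always mean upstream of the gene.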

Figure 2: A representation of ChIP sequencing

2. Micrococcal Nuclease-Seq (MNase-Seq)

Nucleosome occupancy can tell us about regions of active genes and chromatin structure in eukaryotes. NGS allows us to profile nucleosome occupancy by sequencing micrococcal nuclease (MNase)-digested genomic DNA. MNase preferentially digests the linker DNA between nucleosomes, that is, DNA that is not protected by histone octamers or other bound proteins.

Figure 3: The workflow of an MNase protection assay

DNA is crosslinked to the protein using formaldehyde before MNase digestion. Once the digestion step is complete, the crosslinks are reversed. Then, the digested DNA is run on a gel to select the desired digestion products, which are purified and subsequently used for NGS. To control for MNase sequence bias, GC/AT preference, and other technical biases, it is necessary to concurrently sequence genomic DNA from the same sample without crosslinking, and to compare the two datasets during analysis.
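
One simple way to carry out that comparison is a per-base log2 ratio of sample coverage over control coverage. The sketch below is a minimal illustration under several assumptions of ours: fragments are given as (start, end) pairs with the end exclusive, library sizes are equalized by fragment count, and a pseudocount of 1 is added. The toy coordinates are invented.

```python
import math


def coverage(fragments, region_length):
    """Per-base coverage from (start, end) fragments, end exclusive."""
    # Difference array: +1 where a fragment starts, -1 just past its end.
    changes = [0] * (region_length + 1)
    for start, end in fragments:
        changes[start] += 1
        changes[end] -= 1
    depth, running = [], 0
    for change in changes[:region_length]:
        running += change
        depth.append(running)
    return depth


def log2_occupancy(sample_fragments, control_fragments, region_length, pseudocount=1.0):
    """Per-base log2(sample / control) after scaling to equal library size."""
    sample = coverage(sample_fragments, region_length)
    control = coverage(control_fragments, region_length)
    scale = len(control_fragments) / len(sample_fragments)
    return [
        math.log2((s * scale + pseudocount) / (c + pseudocount))
        for s, c in zip(sample, control)
    ]


# Hypothetical fragments in a 10 bp toy region.
sample = [(0, 4), (0, 4), (1, 5)]
control = [(0, 4), (4, 8), (6, 10)]

print(coverage(sample, 10))               # [2, 3, 3, 3, 1, 0, 0, 0, 0, 0]
ratios = log2_occupancy(sample, control, 10)
print(ratios[1], ratios[9])               # 1.0 -1.0
```

Positive values mark positions that are over-represented in the MNase-protected sample relative to the control, which is the pattern expected for well-positioned nucleosomes.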

Concluding Remarks

Over the course of these two blog posts, we have explored the power of NGS at several levels, from whole-genome sequencing down to characterizing epigenetic differences that impact gene expression. NGS allows scientists to gain a deeper, holistic understanding of the genome and of variations that may be markers of disease. No other technique can provide such a complete picture in a relatively short time frame. As costs continue to decrease, these techniques will play a growing role in areas such as drug discovery, clinical diagnostics, and ultimately personalized medicine. Stay tuned to this blog for more information on these and many other techniques being developed in the world of NGS.

To learn more about gene prediction and how NGS can assist you, and to get access to all of our advanced materials including 20 training videos, presentations, workbooks, and private group membership, get on the Expert Sequencing wait list.

Deepak Kumar, PhD Genomics Software Application Engineer

Deepak Kumar is a Genomics Software Application Engineer (Bioinformatics) at Agilent Technologies. He is the founder of the Expert Sequencing Program (ExSeq) at Cheeky Scientist. The ExSeq program provides a holistic understanding of the Next Generation Sequencing (NGS) field - its intricate concepts, and insights on sequenced data computational analyses. He holds diverse professional experience in Bioinformatics and computational biology and is always keen on formulating computational solutions to biological problems.
