Steps in analyses of gene expression data and popular differential gene expression analysis methods
Published on January 30th, 2024
by Antonino Zito, PhD
Differential Gene Expression Analysis aims to detect features (i.e., genes) exhibiting substantial differences in the levels of gene expression between conditions. Differential gene expression testing is a key component of the discovery process in biological research.
It provides scientists with a powerful tool to gain valuable information on possible molecular factors and mechanisms underlying biology in health and disease.
At BigOmics we employ differential gene expression testing at a large scale to identify putative disease-associated features and genes perturbed upon drug treatments. In this blog post, we provide a basic overview of differential gene expression methods as an integral part of a common bioinformatic pipeline [Figure 1].
In Figure 1 above, we show the typical steps in analyses of gene expression data:
1. Data collection. Microarrays or RNA-seq are the most employed techniques to measure gene expression levels of thousands of genes per sample.
2. Data processing. Raw gene expression data usually need to be cleaned to remove noise as well as technical and unwanted biological effects. This step also involves normalization, a tailored mathematical approach that enables valid comparison between groups. Curious and want to hear more about normalization and why it’s so important? Read our tech blog dedicated to normalization.
3. Differential gene expression. Differentially expressed genes (DEGs) are genes exhibiting significant changes in expression between experimental groups. DEGs can be identified with various statistical methods that assess both the magnitude and the statistical significance of the difference between groups (i.e., ‘fold-change’ and ‘p-value’, respectively) [Figure 2].
4. Functional analyses of discovery set. Differential gene expression analyses may result in a list of DEGs which collectively define the ‘discovery set’. Researchers carefully search the discovery set for genes known to be associated with the phenotype or condition of interest, as well as for potentially new associations. New associations would ideally need validation with molecular biology assays (e.g., PCR). Regardless, functional enrichment analyses are often conducted to gain knowledge on the possible biological roles of the discovery set at the cellular level. For instance, it’s interesting to assess whether the genes are enriched for known biological functions and pathways, or are distinctively enriched for disease-associated features.
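To make step 3 more concrete, here is a minimal sketch of how a ‘fold-change’ and a ‘p-value’ can be computed for each gene. It is an illustration only, not the method used by any specific tool: the gene names and expression values are hypothetical, the pseudocount of 1 is an assumption, and the p-value comes from a simple exact permutation test on the difference in group means, chosen here because it needs nothing beyond the Python standard library.

```python
import math
from itertools import combinations
from statistics import mean


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2 ratio of mean expression (treated vs control).

    The pseudocount (an assumption of this sketch) avoids taking log of zero.
    """
    return math.log2((mean(treated) + pseudocount) / (mean(control) + pseudocount))


def permutation_p_value(control, treated):
    """Exact two-sided permutation p-value for the difference in group means."""
    pooled = list(control) + list(treated)
    n_treated = len(treated)
    n_control = len(pooled) - n_treated
    total = sum(pooled)
    observed = abs(mean(treated) - mean(control))
    extreme = 0
    n_perm = 0
    # Enumerate every way of relabelling samples as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        t_sum = sum(pooled[i] for i in idx)
        diff = abs(t_sum / n_treated - (total - t_sum) / n_control)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        n_perm += 1
    return extreme / n_perm


# Hypothetical normalized expression values: (control samples, treated samples)
expression = {
    "GENE_A": ([10, 12, 11, 9, 10], [40, 42, 39, 45, 41]),
    "GENE_B": ([20, 22, 19, 21, 20], [21, 20, 22, 19, 20]),
}

results = {
    gene: (log2_fold_change(c, t), permutation_p_value(c, t))
    for gene, (c, t) in expression.items()
}

for gene, (lfc, p) in results.items():
    print(f"{gene}: log2FC={lfc:.2f}, p={p:.4f}")
```

In this toy dataset GENE_A shows a large fold-change with a small p-value, while GENE_B shows neither. In a real analysis, p-values across thousands of genes would also need correction for multiple testing.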
With the advance of Bioinformatics and Computational Biology, numerous methods for differential gene expression testing have been developed. Each method has its own strengths and weaknesses, and its applicability depends on the data type, the size of the dataset, the availability of replicates, and the type of test needed to address the proposed questions.
At BigOmics, we take great care in how we conduct differential gene expression analysis. Below we list highly popular methods for differential gene expression testing, available in Omics Playground.
Aside from these highly popular methods, numerous others have been developed. Researchers often evaluate different methods and select the appropriate one based on their needs.
Omics Playground is equipped with 9 distinct differential gene expression methods, covering a wide range of experimental conditions. It’s our priority to offer researchers of any background a vast range of choices to study their data in detail, in the fastest possible time, and without requiring any coding. Our differential gene expression workflow is accompanied by extensive visualizations, including Volcano plots, Box and Bar plots, and Heatmaps, and by functional enrichment testing of biological pathways.
Video 1. Expression analysis options in Omics Playground.
As mentioned, Omics Playground is equipped with 9 differential gene expression methods. Users just need to upload their data and select which statistical methods to use [Figure 3].
The Omics Playground platform is capable of integrating results from multiple differential gene expression methods to provide researchers with greater robustness in the results. For example, in a common differential gene expression analysis, it reports multiple statistics including (i) a ‘meta q-value’ for each gene, obtained by combining the p-values from the distinct differential gene expression methods; (ii) a ‘Star classification’ indicating how many of the chosen statistical methods identify a gene as dysregulated [Figure 4].
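The idea of combining results across methods can be sketched in a few lines. This is a hedged illustration, not the platform's actual implementation: we assume the most conservative combination rule (the maximum p-value across methods) followed by a Benjamini-Hochberg adjustment, and we count a ‘star’ whenever a method's p-value falls below 0.05. The gene names, p-values, and the 0.05 cutoff are all hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values per gene, one entry per statistical method.
p_values = {
    "GENE_A": [0.001, 0.002, 0.004],
    "GENE_B": [0.03, 0.20, 0.04],
    "GENE_C": [0.50, 0.70, 0.60],
}
ALPHA = 0.05  # assumed significance cutoff for awarding a star

genes = list(p_values)

# Assumed combination rule: keep the least significant p-value per gene.
meta_p = [max(p_values[g]) for g in genes]
meta_q = dict(zip(genes, benjamini_hochberg(meta_p)))

# Number of methods that call each gene significant.
stars = {g: sum(p < ALPHA for p in p_values[g]) for g in genes}

for g in genes:
    print(f"{g}: meta q={meta_q[g]:.3f}, stars={'*' * stars[g] or '-'}")
```

Taking the maximum p-value means a gene is only reported as significant when every chosen method agrees, which favors robustness over sensitivity. Other combination rules would trade these off differently.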
Researchers can refine the discovery set by adjusting statistical parameters and see how many significantly dysregulated genes are detected at different thresholds. Researchers can also select individual genes, visualize their profiles across experimental groups, and extract data as tables reporting the pathways and gene sets in which each gene is annotated [Figure 4].
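As a rough sketch of what adjusting thresholds means in practice, the snippet below counts how many genes pass different combinations of q-value and fold-change cutoffs. The gene statistics and the cutoff values are hypothetical and chosen purely for illustration.

```python
# Hypothetical per-gene results: (log2 fold-change, q-value)
gene_stats = {
    "GENE_A": (2.1, 0.001),
    "GENE_B": (-1.4, 0.03),
    "GENE_C": (0.6, 0.04),
    "GENE_D": (1.8, 0.20),
    "GENE_E": (-0.2, 0.90),
}


def select_degs(stats, q_max, lfc_min):
    """Return genes with q-value <= q_max and |log2 fold-change| >= lfc_min."""
    return sorted(
        gene
        for gene, (lfc, q) in stats.items()
        if q <= q_max and abs(lfc) >= lfc_min
    )


# (q-value cutoff, minimum absolute log2 fold-change)
thresholds = [(0.05, 1.0), (0.05, 0.5), (0.25, 1.0)]
counts = {t: len(select_degs(gene_stats, *t)) for t in thresholds}

for (q_max, lfc_min), n in counts.items():
    print(f"q <= {q_max}, |log2FC| >= {lfc_min}: {n} DEGs")
```

Relaxing either cutoff enlarges the discovery set, so it is worth checking how stable the genes of interest are across reasonable threshold choices.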
Omics Playground makes Bioinformatics analysis accessible to everyone, regardless of their programming skills. It also supports Bioinformaticians looking to delegate more routine Omics data analysis to biologists, a win-win scenario for both parties.

Antonino is a senior bioinformatics engineer at BigOmics with a strong background in bioinformatics and biostatistics. With a PhD in genetics and bioinformatics and an MSc in biotechnology, he has made significant contributions to computational analysis in numerous projects during his previous research at Harvard Medical School and King’s College London.
