Three Main Challenges in Omics Data Analysis and the Future of Bioinformatics – Interview with Ming Tommy Tang

Published on December 12th, 2023

Introduction

Omics data analysis has several bottlenecks, but three challenges are the most pressing: data processing, inefficient bioinformatics infrastructure, and collaboration between teams. We discussed these points and more in a recent webinar with Ming Tommy Tang, an expert in bioinformatics and Director of Computational Biology at Immunitas.

He shared his experiences and perspectives on the evolving landscape of computational biology and on how AI could help overcome these bottlenecks in the future. Continue reading or watch the recording of our interview below.

Tommy completed his undergraduate studies at Shanghai Jiao Tong University and his postdoctoral studies at the University of Florida. He held research positions at MD Anderson, Harvard University and Dana-Farber Cancer Institute before taking his current position as Director of Computational Biology at Immunitas. A decade ago, he started sharing his knowledge publicly on social media, inspired by a blog post by Stephen Turner about staying up to date in bioinformatics.

After moving from academia to industry, he noticed a big difference between the two worlds. In industry, there is a common goal of drug development, requiring team members to work closely together. In comparison, the academic world focuses more on individual success; it is primarily about publishing in a reputable journal as the first author. 

The need to analyze large amounts of data pushed Tommy from wet lab science into computational biology. Initially, he simply wanted to avoid depending on someone else to analyze his data, but he quickly developed a real passion for data analysis and programming.

With the evolving role of computational biology and its impact on the field, Tommy sees three important trends. 

Top Challenges in Omics Data Analysis

The Need for an Improved Infrastructure for Routine Data Pre-processing

The first challenge Tommy mentioned is the need for better infrastructure for routine data pre-processing. Making data analysis accessible, and providing the bioinformatics infrastructure it requires, remains difficult today.

In his experience, programming skills are now essential in modern biology, because designing large-scale experiments and analyzing their data effectively requires close collaboration with computational experts. He thinks data analysis needs to be democratized so that it is accessible to people without advanced programming skills.

The Importance of Harmonizing Public Data for More Efficient Analysis

The second notable trend Tommy sees is the importance of harmonizing public data for more efficient analysis. Before moving on to advanced machine learning methods, it is important to examine and standardize the approaches to data generation, metadata and notation, along with the challenges they raise. Using such methods well also means knowing the pitfalls of large language models and understanding their fundamentals well enough to recognize errors.
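To make the idea of harmonizing metadata concrete, here is a minimal Python sketch. It is not taken from the interview: the field name `sex`, the synonym table and the function names are all hypothetical, chosen only to show how free-text labels from different public datasets might be mapped to one agreed vocabulary.

```python
# Hypothetical synonym table: maps free-text labels to one agreed term.
# A real project would maintain a much larger, curated vocabulary.
SEX_SYNONYMS = {
    "m": "male",
    "male": "male",
    "f": "female",
    "female": "female",
}


def harmonize_sex(label):
    """Return the standard term for a free-text label, or None if unknown."""
    key = label.strip().lower()
    return SEX_SYNONYMS.get(key)


def harmonize_samples(samples):
    """Split sample records into harmonized ones and ones needing manual review."""
    harmonized, unresolved = [], []
    for sample in samples:
        term = harmonize_sex(sample.get("sex", ""))
        if term is None:
            # Unknown labels are set aside instead of being guessed.
            unresolved.append(sample)
        else:
            harmonized.append({**sample, "sex": term})
    return harmonized, unresolved


samples = [
    {"id": "S1", "sex": "M"},
    {"id": "S2", "sex": " Female "},
    {"id": "S3", "sex": "unknown?"},
]
harmonized, unresolved = harmonize_samples(samples)
print(harmonized)
print(unresolved)
```

In this sketch, labels that cannot be mapped are collected for manual review rather than forced into a category, so inconsistencies stay visible before any downstream modeling.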

The potential of AI lies in integrating different types of omics data and in using large language models such as GPT for analysis. Because good data quality is critical to how well AI models perform, Tommy advocates starting with simple statistical methods before moving on to complex AI approaches, and he notes that task-specific approaches are preferable in certain scenarios.

The Challenges of Collaboration

Collaboration between team members can be a real challenge in omics data analysis. Projects involve bioinformaticians and computational biologists, but in translational medicine and drug research, biologists and managers such as principal investigators, project managers and senior leaders are often involved as well. With so many people working on a project, fully analyzing and interpreting the data can take weeks to months and often requires iterative discussions between stakeholders to understand the underlying biological mechanisms.

Omics Playground, the data analysis and visualization platform from BigOmics, is particularly suited to improving these complex communication processes. We’ve highlighted the crucial role of collaboration in the life sciences industry several times in previous blog articles, and this is in line with Tommy’s experience. For him, Omics Playground provides a solution to these challenges, bridging the gap between biologists and bioinformaticians.

Tommy has personally tested and recently reviewed the BigOmics platform:

“The feature that appeals to me the most about this platform is the interactive features of many plots generated. For example, it is easy to upload customized gene sets and highlight them in the volcano plot; the heatmaps are interactive (see the figure below), and one can zoom in and out or hover over the mouse to get the gene information.” [1]

 

At BigOmics, we have designed our data analysis and visualization platform Omics Playground to be used in both academic and industrial settings – for a wide range of collaborations. The platform eliminates the need for programming skills and enables data analysis and visualization for a much wider audience. “Another noteworthy aspect of the Omics Playground platform is its ability to make many comparisons simultaneously”, Tommy explains.

The biggest future challenges for bioinformatics will include a greater variety of data types, larger data volumes and higher throughput, all too much for manual processing. At the same time, developments in AI will enable even more complex analyses.

References

[1] Omics Playground: Derive biological insights from your omics data at your fingertip. https://divingintogeneticsandgenomics.com/post/omics-playground-derive-biological-insights-from-your-omics-data-at-your-fingertip/

For more insights and support, try Omics Playground for yourself!