Deep-learning algorithms enhance mutation detection in cancer and RNA sequencing

Researchers from the Faculty of Engineering at The University of Hong Kong (HKU) have developed two innovative deep-learning algorithms, ClairS-TO and Clair3-RNA, that significantly advance genetic mutation detection in cancer diagnostics and RNA-based genomic studies.

The research team, led by Professor Ruibang Luo from the School of Computing and Data Science, Faculty of Engineering, designed both tools for genetic analysis in clinical and research settings.

Leveraging long-read sequencing technologies, these tools significantly improve the accuracy of detecting genetic mutations in complex samples, opening new horizons for precision medicine and genomic discovery. Both research articles have been published in Nature Communications.

Long-read sequencing technologies capture continuous stretches of DNA and RNA, providing detailed insights into genetic information. However, interpreting this data, especially identifying mutations in challenging conditions, has remained a hurdle. The two new algorithms aim to overcome these obstacles, making genomic analysis faster, more accurate, and more accessible.

ClairS-TO addresses a critical challenge in cancer diagnostics: analyzing tumor DNA without needing matched healthy tissue samples. Standard methods require both tumor and normal samples for comparison, which are not always available.

ClairS-TO eliminates this requirement with a dual-network approach: one network confirms genuine mutations and the other rejects errors. This allows for cost-effective, reliable tumor analysis even when sample material is limited, broadening access to precise cancer diagnostics.
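To make the dual-network idea concrete, here is a minimal Python sketch of how two scores per candidate could be combined. It is purely illustrative and is not ClairS-TO's actual implementation: the `Candidate` class, the `keep_candidate` function, the 0.5 thresholds, and the scores are all hypothetical, and the "both networks must agree" rule is an assumption chosen for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    site: str         # hypothetical label for a candidate mutation site
    p_confirm: float  # assumed score from the network that confirms genuine mutations
    p_reject: float   # assumed score from the network that flags likely errors


def keep_candidate(c: Candidate, min_confirm: float = 0.5, max_reject: float = 0.5) -> bool:
    """Keep a candidate only when both networks agree (illustrative rule, not the published one)."""
    return c.p_confirm >= min_confirm and c.p_reject <= max_reject


# Made-up scores for three hypothetical candidates.
candidates = [
    Candidate("site_a", p_confirm=0.95, p_reject=0.05),  # confirmed, not flagged
    Candidate("site_b", p_confirm=0.90, p_reject=0.80),  # confirmed, but flagged as a likely error
    Candidate("site_c", p_confirm=0.20, p_reject=0.10),  # not confirmed
]

kept = [c.site for c in candidates if keep_candidate(c)]
print(kept)
```

In this toy setup only the first candidate survives, because the second is flagged by the error-rejecting score and the third is never confirmed. That is the intuition behind using two complementary checks when no matched normal sample is available for comparison.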

Meanwhile, Clair3-RNA is, according to the team, the world’s first deep-learning-based small variant caller specifically tailored for long-read RNA sequencing. RNA editing and technical sequencing errors can easily confuse the identification of true genetic variants. Clair3-RNA uses deep learning to distinguish real mutations from RNA editing events and sequencing noise, enabling researchers and clinicians to analyze gene expression and mutations from the same data.

These algorithms are the latest additions to the renowned Clair series, a suite of artificial intelligence (AI)-driven genomic tools developed by Professor Luo’s team.

The series, including the industry-standard Clair3, has become a cornerstone in the field of computational biology. Known for their speed, accuracy, and robustness, these open-source algorithms have amassed over 400,000 downloads. They are widely adopted by leading research institutes and sequencing companies globally, setting the benchmark for processing third-generation sequencing data.

Professor Ruibang Luo commented, “ClairS-TO and Clair3-RNA, along with other algorithms in the Clair series, have established a solid foundation for deep-learning-driven genetic mutation discovery, and accelerated the adoption of precision medicine and clinical genomics.”

These advances represent a significant leap toward more accessible, accurate, and comprehensive genetic analysis. They hold the potential to improve cancer diagnosis, enable personalized medicine, and accelerate genomic research—delivering tangible benefits to patients and scientists around the world.
