Enhancing RNA-seq bias mitigation with the Gaussian self-benchmarking framework: towards unbiased sequencing data

Authors: Qiang Su #, Yi Long #, Deming Gou, Junmin Quan, Qizhou Lian

Abstract

Background: RNA sequencing is a vital technique for analyzing RNA behavior in cells, but it often suffers from various biases that distort the data. Traditional methods to address these biases are typically empirical and handle them individually, limiting their effectiveness. Our study introduces the Gaussian Self-Benchmarking (GSB) framework, a novel approach that leverages the natural distribution patterns of guanine (G) and cytosine (C) content in RNA to mitigate multiple biases simultaneously. This method is grounded in a theoretical model, organizing k-mers based on their GC content and applying a Gaussian model for alignment to ensure empirical sequencing data closely match their theoretical distribution.

Results: The GSB framework demonstrated superior performance in mitigating sequencing biases compared to existing methods. Testing with synthetic RNA constructs and real human samples showed that the GSB approach not only addresses individual biases more effectively but also manages co-existing biases jointly. The framework's reliance on accurately pre-determined parameters like mean and standard deviation of GC content distribution allows for a more precise representation of RNA samples. This results in improved accuracy and reliability of RNA sequencing data, enhancing our understanding of RNA behavior in health and disease.

Conclusions: The GSB framework presents a significant advancement in RNA sequencing analysis by providing a well-validated, multi-bias mitigation strategy. It functions independently from previously identified dataset flaws and sets a new standard for unbiased RNA sequencing results. This development enhances the reliability of RNA studies, broadening the potential for scientific breakthroughs in medicine and biology, particularly in genetic disease research and the development of targeted treatments.

Keywords: Bias mitigation; GC content; Gaussian self-benchmarking (GSB); RNA sequencing.
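The abstract describes grouping k-mers by GC content and aligning the observed counts to a Gaussian whose mean and standard deviation are fixed in advance. The Python sketch below illustrates that general idea only and is not the authors' implementation. It makes several assumptions not found in the abstract: the function names (`gc_count`, `gsb_scale_factors`, `correct_counts`) are hypothetical, GC content is measured as the number of G or C bases per k-mer, and the correction is one multiplicative factor per GC class.

```python
import math
from collections import defaultdict


def gc_count(kmer):
    """Number of G or C bases in a k-mer (assumed unit of GC content)."""
    return sum(1 for base in kmer.upper() if base in "GC")


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def gsb_scale_factors(kmer_counts, mean, sd):
    """Per-GC-class factors that map observed class totals onto a Gaussian.

    Assumption: mean and sd are the pre-determined parameters of the
    theoretical GC distribution; here they are simply passed in.
    """
    # Aggregate observed counts by GC class, ignoring empty k-mers.
    observed = defaultdict(float)
    for kmer, count in kmer_counts.items():
        if count > 0:
            observed[gc_count(kmer)] += count

    total = sum(observed.values())
    # Gaussian density per observed class, renormalised so that the
    # expected class totals sum to the observed grand total.
    density = {g: gaussian_pdf(g, mean, sd) for g in observed}
    norm = sum(density.values())
    return {g: (total * density[g] / norm) / observed[g] for g in observed}


def correct_counts(kmer_counts, mean, sd):
    """Rescale every k-mer count by the factor of its GC class."""
    factors = gsb_scale_factors(kmer_counts, mean, sd)
    return {
        kmer: count * factors.get(gc_count(kmer), 0.0)
        for kmer, count in kmer_counts.items()
    }


if __name__ == "__main__":
    # Toy counts, invented purely for illustration.
    toy = {"AAAA": 10, "AAAG": 30, "GGCC": 60}
    for kmer, value in correct_counts(toy, mean=2.0, sd=1.0).items():
        print(kmer, round(value, 2))
```

In this sketch the grand total of counts is preserved while the per-class totals follow the Gaussian shape. How the published framework estimates the parameters or applies the alignment is not specified in the abstract, so those details are left as inputs here.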
