Effect of Co-segregating Markers on High-Density Genetic Maps and Prediction of Map Expansion Using Machine Learning Algorithms

共分离标记对高密度遗传图谱的影响以及使用机器学习算法预测图谱扩展

阅读:9
作者:Amidou N'Diaye, Jemanesh K Haile, D Brian Fowler, Karim Ammar, Curtis J Pozniak

Abstract

Advances in sequencing and genotyping methods have enable cost-effective production of high throughput single nucleotide polymorphism (SNP) markers, making them the choice for linkage mapping. As a result, many laboratories have developed high-throughput SNP assays and built high-density genetic maps. However, the number of markers may, by orders of magnitude, exceed the resolution of recombination for a given population size so that only a minority of markers can accurately be ordered. Another issue attached to the so-called 'large p, small n' problem is that high-density genetic maps inevitably result in many markers clustering at the same position (co-segregating markers). While there are a number of related papers, none have addressed the impact of co-segregating markers on genetic maps. In the present study, we investigated the effects of co-segregating markers on high-density genetic map length and marker order using empirical data from two populations of wheat, Mohawk × Cocorit (durum wheat) and Norstar × Cappelle Desprez (bread wheat). The maps of both populations consisted of 85% co-segregating markers. Our study clearly showed that excess of co-segregating markers can lead to map expansion, but has little effect on markers order. To estimate the inflation factor (IF), we generated a total of 24,473 linkage maps (8,203 maps for Mohawk × Cocorit and 16,270 maps for Norstar × Cappelle Desprez). Using seven machine learning algorithms, we were able to predict with an accuracy of 0.7 the map expansion due to the proportion of co-segregating markers. For example in Mohawk × Cocorit, with 10 and 80% co-segregating markers the length of the map inflated by 4.5 and 16.6%, respectively. Similarly, the map of Norstar × Cappelle Desprez expanded by 3.8 and 11.7% with 10 and 80% co-segregating markers. With the increasing number of markers on SNP-chips, the proportion of co-segregating markers in high-density maps will continue to increase making map expansion unavoidable. Therefore, we suggest developers improve linkage mapping algorithms for efficient analysis of high-throughput data. This study outlines a practical strategy to estimate the IF due to the proportion of co-segregating markers and outlines a method to scale the length of the map accordingly.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。