Authors
Rachel M Sherman, Juliet Forman, Valentin Antonescu, Daniela Puiu, Michelle Daya, Nicholas Rafaels, Meher Preethi Boorgula, Sameer Chavan, Candelaria Vergara, Victor E Ortega, Albert M Levin, Celeste Eng, Maria Yazdanbakhsh, James G Wilson, Javier Marrugo, Leslie A Lange, L Keoki Williams, Harold Watson, Lorraine B Ware, Christopher O Olopade, Olufunmilayo Olopade, Ricardo R Oliveira, Carole Ober, Dan L Nicolae, Deborah A Meyers, Alvaro Mayorga, Jennifer Knight-Madden, Tina Hartert, Nadia N Hansel, Marilyn G Foreman, Jean G Ford, Mezbah U Faruque, Georgia M Dunston, Luis Caraballo, Esteban G Burchard, Eugene R Bleecker, Maria I Araujo, Edwin F Herrera-Paz, Monica Campbell, Cassandra Foster, Margaret A Taub, Terri H Beaty, Ingo Ruczinski, Rasika A Mathias, Kathleen C Barnes, Steven L Salzberg
Publication date
2019/1
Journal
Nature Genetics
Volume
51
Issue
1
Pages
30-35
Publisher
Nature Publishing Group US
Description
We used a deeply sequenced dataset of 910 individuals, all of African descent, to construct a set of DNA sequences that is present in these individuals but missing from the reference human genome. We aligned 1.19 trillion reads from the 910 individuals to the reference genome (GRCh38), collected all reads that failed to align, and assembled these reads into contiguous sequences (contigs). We then compared all contigs to one another to identify a set of unique sequences representing regions of the African pan-genome missing from the reference genome. Our analysis revealed 296,485,284 bp in 125,715 distinct contigs present in the populations of African descent, demonstrating that the African pan-genome contains ~10% more DNA than the current human reference genome. Although the functional significance of nearly all of this sequence is unknown, 387 of the novel contigs fall within 315 distinct protein …
Total citations
[Bar chart: citations per year, 2018 to 2024]
Scholar articles