A fully-automated method discovers loss of mouse-lethal and human-monogenic disease genes in 58 mammals

Y Turakhia, HI Chen, A Marcovitz… - Nucleic acids …, 2020 - academic.oup.com
Nucleic Acids Research, 2020 - academic.oup.com
Abstract
Gene losses provide an insightful route for studying the morphological and physiological adaptations of species, but their discovery is challenging. Existing genome annotation tools focus on annotating intact genes and do not attempt to distinguish nonfunctional genes from genes missing annotation due to sequencing and assembly artifacts. Previous attempts to annotate gene losses have required significant manual curation, which hampers their scalability for the ever-increasing deluge of newly sequenced genomes. Using extreme sequence erosion (amino acid deletions and substitutions) and sister species support as an unambiguous signature of loss, we developed an automated approach for detecting high-confidence gene loss events across a species tree. Our approach relies solely on gene annotation in a single reference genome, raw assemblies for the remaining species to analyze, and the associated phylogenetic tree for all organisms involved. Using human as reference, we discovered over 400 unique human ortholog erosion events across 58 mammals. This includes dozens of clade-specific losses of genes that result in early mouse lethality or are associated with severe human congenital diseases. Our discoveries yield intriguing potential for translational medical genetics and evolutionary biology, and our approach is readily applicable to large-scale genome sequencing efforts across the tree of life.
Oxford University Press