De novo gene birth
SB Van Oss, AR Carvunis - PLoS genetics, 2019 - journals.plos.org
De novo gene birth is the process by which new genes evolve from DNA sequences that
were ancestrally non-genic. De novo genes represent a subset of novel genes, and may be …
Improved global protein homolog detection with major gains in function identification
There are several hundred million protein sequences, but the relationships among them are
not fully available from existing homolog detection methods. There is an essential need for …
From de novo to “de nono”: the majority of novel protein-coding genes identified with phylostratigraphy are old genes or recent duplicates
C Casola - Genome biology and evolution, 2018 - academic.oup.com
The evolution of novel protein-coding genes from noncoding regions of the genome is one
of the most compelling pieces of evidence for genetic innovations in nature. One popular …
An efficient protein homology detection approach based on seq2seq model and ranking
S Gao, S Yu, S Yao - Biotechnology & Biotechnological Equipment, 2021 - Taylor & Francis
Evolutionary information is essential for the protein annotation. The number of homologs of a
protein retrieved is correlated with the annotations related to the protein structure or function …
Deep semantic protein representation for annotation, discovery, and engineering
Computational assignment of function to proteins with no known homologs is still an
unsolved problem. We have created a novel, function-based approach to protein annotation …
Protein domain embeddings for fast and accurate similarity search
Recently developed protein language models have enabled a variety of applications of the
protein contextual embeddings. Per-protein representations (each protein is represented as …
Ultra-fast global homology detection with discrete cosine transform and dynamic time warping
Motivation Evolutionary information is crucial for the annotation of proteins in bioinformatics.
The amount of retrieved homologs often correlates with the quality of predicted protein …
Searching for an identity: Functional characterization of taxonomically restricted genes in grain amaranth
G Cabrales-Orona, JP Délano-Frier - The amaranth genome, 2021 - Springer
Taxonomically restricted genes, or TRGs, are specific to a particular taxon that can be found
only in the genomes of single species or are represented as orthologs in closely related …
Multiple sequence alignment is not a solved problem
DA Morrison - arXiv preprint arXiv:1808.07717, 2018 - arxiv.org
Multiple sequence alignment is a basic procedure in molecular biology, and it is often
treated as being essentially a solved computational problem. However, this is not so, and …
A new paradigm for biological sequence retrieval inspired by natural language processing and database research
AJ Rousseau, S Lemal, Y Korovin, G Triantopoulos… - bioRxiv, 2023 - biorxiv.org
Nearly-exponential growth and heterogeneity of biological sequence data make the task of
biological sequence retrieval from databases more important and challenging than ever. In …