Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases
The widespread occurrence of repetitive stretches of DNA in genomes of organisms across
the tree of life imposes fundamental challenges for sequencing, genome assembly, and …
The road to metagenomics: from microbiology to DNA sequencing technologies and bioinformatics
A Escobar-Zepeda, A Vera-Ponce de León… - Frontiers in …, 2015 - frontiersin.org
The study of microorganisms that pervade each and every part of this planet has
encountered many challenges through time such as the discovery of unknown organisms …
RepeatModeler2 for automated genomic discovery of transposable element families
The accelerating pace of genome sequencing throughout the tree of life is driving the need
for improved unsupervised annotation of genome components such as transposable …
MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization
K Katoh, J Rozewicki, KD Yamada - Briefings in bioinformatics, 2019 - academic.oup.com
This article describes several features in the MAFFT online service for multiple sequence
alignment (MSA). As a result of recent advances in sequencing technologies, huge numbers …
The genome of cultivated peanut provides insight into legume karyotypes, polyploid evolution and crop domestication
High oil and protein content make tetraploid peanut a leading oil and food legume. Here we
report a high-quality peanut genome sequence, comprising 2.54 Gb with 20 …
The Ensembl gene annotation system
The Ensembl gene annotation system has been used to annotate over 70 different
vertebrate species across a wide range of genome projects. Furthermore, it generates the …
A collinearity-incorporating homology inference strategy for connecting emerging assemblies in the Triticeae tribe as a pilot practice in the plant pangenomic era
Plant genome sequencing has dramatically increased, and some species even have
multiple high-quality reference versions. Demands for clade-specific homology inference …
The genome of Chenopodium quinoa
Chenopodium quinoa (quinoa) is a highly nutritious grain identified as an important crop to
improve world food security. Unfortunately, few resources are available to facilitate its …
A simple method to control over-alignment in the MAFFT multiple sequence alignment program
K Katoh, DM Standley - Bioinformatics, 2016 - academic.oup.com
Motivation: We present a new feature of the MAFFT multiple alignment program for
suppressing over-alignment (aligning unrelated segments). Conventional MAFFT is highly …
Using intron position conservation for homology-based gene prediction
Annotation of protein-coding genes is very important in bioinformatics and biology and has a
decisive influence on many downstream analyses. Homology-based gene prediction …