Large multiple sequence alignments with a root-to-leaf regressive method

R Guigó - Cell Genomics, 2023 - cell.com

Within the next decade, the genomes of 1.8 million eukaryotic species will be sequenced.
Identifying genes in these sequences is essential to understand the biology of the species …

Cited by 14 Related articles All 7 versions

Reference flow: reducing reference bias using multiple population genomes

NC Chen, B Solomon, T Mun, S Iyer, B Langmead - Genome biology, 2021 - Springer

Most sequencing data analyses start by aligning sequencing reads to a linear reference
genome, but failure to account for genetic variation leads to reference bias and confounding …

Cited by 80 Related articles All 15 versions

Towards the accurate alignment of over a million protein sequences: Current state of the art

L Santus, E Garriga, S Deorowicz, A Gudyś… - Current Opinion in …, 2023 - Elsevier

Large-scale genomics requires highly scalable and accurate multiple sequence alignment
methods. Results collected over this last decade suggest accuracy loss when scaling up …

Cited by 7 Related articles All 4 versions

MAGUS: multiple sequence alignment using graph clustering

V Smirnov, T Warnow - Bioinformatics, 2021 - academic.oup.com

Motivation: The estimation of large multiple sequence alignments (MSAs) is a basic
bioinformatics challenge. Divide-and-conquer is a useful approach that has been shown to …

Cited by 55 Related articles All 10 versions

Leveraging protein language models for accurate multiple sequence alignments

CD McWhite, I Armour-Garb, M Singh - Genome Research, 2023 - genome.cshlp.org

Multiple sequence alignment (MSA) is a critical step in the study of protein sequence and
function. Typically, MSA algorithms progressively align pairs of sequences and combine …

Cited by 12 Related articles All 7 versions

Phylogeny estimation given sequence length heterogeneity

V Smirnov, T Warnow - Systematic biology, 2021 - academic.oup.com

Phylogeny estimation is a major step in many biological studies, and has many well known
challenges. With the dropping cost of sequencing technologies, biologists now have …

Cited by 33 Related articles All 10 versions

UPP2: fast and accurate alignment of datasets with fragmentary sequences

M Park, S Ivanovic, G Chu, C Shen, T Warnow - Bioinformatics, 2023 - academic.oup.com

Motivation: Multiple sequence alignment (MSA) is a basic step in many bioinformatics
pipelines. However, achieving highly accurate alignments on large datasets, especially …

Cited by 9 Related articles All 9 versions

learnMSA: learning and aligning large protein families

F Becker, M Stanke - GigaScience, 2022 - academic.oup.com

Background: The alignment of large numbers of protein sequences is a challenging task and
its importance grows rapidly along with the size of biological datasets. State-of-the-art …

Cited by 6 Related articles All 8 versions

Recursive MAGUS: scalable and accurate multiple sequence alignment

V Smirnov - PLoS Computational Biology, 2021 - journals.plos.org

Multiple sequence alignment tools struggle to keep pace with rapidly growing sequence
data, as few methods can handle large datasets while maintaining alignment accuracy. We …

Cited by 14 Related articles All 13 versions

learnMSA2: deep protein multiple alignments with large language and hidden Markov models

F Becker, M Stanke - Bioinformatics, 2024 - academic.oup.com

Motivation: For the alignment of large numbers of protein sequences, tools are predominant
that decide to align two residues using only simple prior knowledge, e.g., amino acid …