Pangenome graphs

JM Eizenga, AM Novak, JA Sibbesen… - Annual review of …, 2020 - annualreviews.org
Low-cost whole-genome assembly has enabled the collection of haplotype-resolved
pangenomes for numerous organisms. In turn, this technological change is encouraging the …

Computational pan-genomics: status, promises and challenges

Briefings in bioinformatics, 2018 - academic.oup.com
Many disciplines, from human genetics and oncology to plant breeding, microbiology and
virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes …

The design and construction of reference pangenome graphs with minigraph

H Li, X Feng, C Chu - Genome biology, 2020 - Springer
The recent advances in sequencing technologies enable the assembly of individual
genomes to the quality of the reference genome. How to integrate multiple genomes from …

Indexing highly repetitive string collections, part II: Compressed indexes

G Navarro - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
Two decades ago, a breakthrough in indexing string collections made it possible to
represent them within their compressed space while at the same time offering indexed …

Fully functional suffix trees and optimal text searching in BWT-runs bounded space

T Gagie, G Navarro, N Prezza - Journal of the ACM (JACM), 2020 - dl.acm.org
Indexing highly repetitive texts—such as genomic databases, software repositories and
versioned text collections—has become an important problem since the turn of the …

Indexing graphs for path queries with applications in genome research

J Sirén, N Välimäki, V Mäkinen - IEEE/ACM transactions on …, 2014 - ieeexplore.ieee.org
We propose a generic approach to replace the canonical sequence representation of
genomes with graph representations, and study several applications of such extensions. We …

Wavelet trees for all

G Navarro - Journal of Discrete Algorithms, 2014 - Elsevier
The wavelet tree is a versatile data structure that serves a number of purposes, from string
processing to computational geometry. It can be regarded as a device that represents a …

At the roots of dictionary compression: string attractors

D Kempa, N Prezza - Proceedings of the 50th Annual ACM SIGACT …, 2018 - dl.acm.org
A well-known fact in the field of lossless text compression is that high-order entropy is a
weak model when the input contains long repetitions. Motivated by this fact, decades of …

[Book] Genome-scale algorithm design

V Mäkinen, D Belazzougui, F Cunial, AI Tomescu - 2015 - books.google.com
High-throughput sequencing has revolutionised the field of biological sequence analysis. Its
application has enabled researchers to address important biological questions, often for the …

When less is more: sketching with minimizers in genomics

M Ndiaye, S Prieto-Baños, LM Fitzgerald… - Genome Biology, 2024 - Springer
The exponential increase in sequencing data calls for conceptual and computational
advances to extract useful biological insights. One such advance, minimizers, allows for …