Pangenome graphs

JM Eizenga, AM Novak, JA Sibbesen… - Annual review of …, 2020 - annualreviews.org
Low-cost whole-genome assembly has enabled the collection of haplotype-resolved
pangenomes for numerous organisms. In turn, this technological change is encouraging the …

Computational graph pangenomics: a tutorial on data structures and their applications

JA Baaijens, P Bonizzoni, C Boucher… - Natural Computing, 2022 - Springer
Computational pangenomics is an emerging research field that is changing the way
computer scientists are facing challenges in biological sequence analysis. In past decades …

Haplotype-aware graph indexes

J Sirén, E Garrison, AM Novak, B Paten… - Bioinformatics, 2020 - academic.oup.com
Motivation: The variation graph toolkit (VG) represents genetic variation as a graph. Although
each path in the graph is a potential haplotype, most paths are non-biological, unlikely …

Bacterial genomic epidemiology with mixed samples

T Mäklin, T Kallonen, J Alanko… - Microbial …, 2021 - microbiologyresearch.org
Genomic epidemiology is a tool for tracing transmission of pathogens based on
whole-genome sequencing. We introduce the mGEMS pipeline for genomic epidemiology with …

On the complexity of sequence-to-graph alignment

C Jain, H Zhang, Y Gao, S Aluru - Journal of Computational Biology, 2020 - liebertpub.com
Availability of extensive genetic data across multiple individuals and populations is driving
the growing importance of graph-based reference representations. Aligning sequences to …

On the complexity of string matching for graphs

M Equi, R Grossi, V Mäkinen… - International …, 2019 - researchportal.helsinki.fi
Exact string matching in labeled graphs is the problem of searching paths of a graph G=(V,
E) such that the concatenation of their node labels is equal to the given pattern string P [1 …

On indexing and compressing finite automata

N Cotumaccio, N Prezza - Proceedings of the 2021 ACM-SIAM Symposium on …, 2021 - SIAM
An index for a finite automaton is a powerful data structure that supports locating paths
labeled with a query pattern, thus solving pattern matching on the underlying regular …

Small Searchable κ-Spectra via Subset Rank Queries on the Spectral Burrows-Wheeler Transform

JN Alanko, SJ Puglisi, J Vuohtoniemi - SIAM Conference on Applied and …, 2023 - SIAM
The κ-spectrum of a string is the set of all distinct substrings of length κ occurring in the
string. This is a lossy but computationally convenient representation of the information in the …

Graphs cannot be indexed in polynomial time for sub-quadratic time string matching, unless SETH fails

M Equi, V Mäkinen, AI Tomescu - Theoretical Computer Science, 2023 - Elsevier
The string matching problem on a node-labeled graph G=(V, E) asks whether a given
pattern string P equals the concatenation of node labels of some path in G. This is a basic …

Graphs can be succinctly indexed for pattern matching in O(|E|^2 + |V|^{5/2}) time

N Cotumaccio - 2022 Data Compression Conference (DCC), 2022 - ieeexplore.ieee.org
For the first time we provide a succinct pattern matching index for arbitrary graphs that can
be built in polynomial time, while improving both space and query time bounds from SODA …