Pangenome graphs
JM Eizenga, AM Novak, JA Sibbesen… - Annual review of …, 2020 - annualreviews.org
Low-cost whole-genome assembly has enabled the collection of haplotype-resolved
pangenomes for numerous organisms. In turn, this technological change is encouraging the …
Computational graph pangenomics: a tutorial on data structures and their applications
Computational pangenomics is an emerging research field that is changing the way
computer scientists are facing challenges in biological sequence analysis. In past decades …
Haplotype-aware graph indexes
Motivation The variation graph toolkit (VG) represents genetic variation as a graph. Although
each path in the graph is a potential haplotype, most paths are non-biological, unlikely …
Bacterial genomic epidemiology with mixed samples
Genomic epidemiology is a tool for tracing transmission of pathogens based on whole-
genome sequencing. We introduce the mGEMS pipeline for genomic epidemiology with …
On the complexity of sequence-to-graph alignment
Availability of extensive genetic data across multiple individuals and populations is driving
the growing importance of graph-based reference representations. Aligning sequences to …
On the complexity of string matching for graphs
Exact string matching in labeled graphs is the problem of searching paths of a graph G=(V,
E) such that the concatenation of their node labels is equal to the given pattern string P [1 …
On indexing and compressing finite automata
N Cotumaccio, N Prezza - Proceedings of the 2021 ACM-SIAM Symposium on …, 2021 - SIAM
An index for a finite automaton is a powerful data structure that supports locating paths
labeled with a query pattern, thus solving pattern matching on the underlying regular …
Small Searchable κ-Spectra via Subset Rank Queries on the Spectral Burrows-Wheeler Transform
JN Alanko, SJ Puglisi, J Vuohtoniemi - SIAM Conference on Applied and …, 2023 - SIAM
The κ-spectrum of a string is the set of all distinct substrings of length κ occurring in the
string. This is a lossy but computationally convenient representation of the information in the …
Graphs cannot be indexed in polynomial time for sub-quadratic time string matching, unless SETH fails
The string matching problem on a node-labeled graph G=(V, E) asks whether a given
pattern string P equals the concatenation of node labels of some path in G. This is a basic …
Graphs can be succinctly indexed for pattern matching in O(|E|^2 + |V|^{5/2}) time
N Cotumaccio - 2022 Data Compression Conference (DCC), 2022 - ieeexplore.ieee.org
For the first time we provide a succinct pattern matching index for arbitrary graphs that can
be built in polynomial time, while improving both space and query time bounds from SODA …