Computational graph pangenomics: a tutorial on data structures and their applications

JA Baaijens, P Bonizzoni, C Boucher… - Natural Computing, 2022 - Springer
Computational pangenomics is an emerging research field that is changing the way
computer scientists are facing challenges in biological sequence analysis. In past decades …

Hardware acceleration of genomics data analysis: challenges and opportunities

T Robinson, J Harkin, P Shukla - Bioinformatics, 2021 - academic.oup.com
The significant decline in the cost of genome sequencing has dramatically changed the
typical bioinformatics pipeline for analysing sequencing data. Where traditionally, the …

SVDSS: structural variation discovery in hard-to-call genomic regions using sample-specific strings from accurate long reads

L Denti, P Khorsand, P Bonizzoni, F Hormozdiari… - Nature …, 2023 - nature.com
Structural variants (SVs) account for a large amount of sequence variability across genomes
and play an important role in human genomics and precision medicine. Despite intense …

Representation of k-Mer Sets Using Spectrum-Preserving String Sets

A Rahman, P Medvedev - Journal of Computational Biology, 2021 - liebertpub.com
Given the popularity and elegance of k-mer-based tools, finding a space-efficient way to
represent a set of k-mers is important for improving the scalability of bioinformatics analyses …

The Statistics of k-mers from a Sequence Undergoing a Simple Mutation Process Without Spurious Matches

A Blanca, RS Harris, D Koslicki… - Journal of Computational …, 2022 - liebertpub.com
k-mer-based methods are widely used in bioinformatics, but there are many gaps in our
understanding of their statistical properties. Here, we consider the simple model where a …

Disk compression of k-mer sets

A Rahman, R Chikhi, P Medvedev - Algorithms for Molecular Biology, 2021 - Springer
K-mer based methods have become prevalent in many areas of bioinformatics. In
applications such as database search, they often work with large multi-terabyte-sized …

Sequencing technologies and analyses: where have we been and where are we going?

V Bansal, C Boucher - iScience, 2019 - cell.com
A wave of technologies transformed sequencing over a decade ago into the high-throughput
era, demanding research in new computational methods to analyze these data. The …

Kevlar: a mapping-free framework for accurate discovery of de novo variants

DS Standage, CT Brown, F Hormozdiari - iScience, 2019 - cell.com
De novo genetic variants are an important source of causative variation in complex genetic
disorders. Many methods for variant discovery rely on mapping reads to a reference …

KAGE: fast alignment-free graph-based genotyping of SNPs and short indels

I Grytten, K Dagestad Rand, GK Sandve - Genome Biology, 2022 - Springer
Genotyping is a core application of high-throughput sequencing. We present KAGE, a
genotyper for SNPs and short indels that is inspired by recent developments within graph …

Pangenomic genotyping with the marker array

T Mun, NSK Vaddadi, B Langmead - Algorithms for Molecular Biology, 2023 - Springer
We present a new method and software tool called rowbowt that applies a pangenome
index to the problem of inferring genotypes from short-read sequencing data. The method …