Pan-genomics in the human genome era

RM Sherman, SL Salzberg - Nature Reviews Genetics, 2020 - nature.com
Since the early days of the genome era, the scientific community has relied on a single
'reference' genome for each species, which is used as the basis for a wide range of genetic …

Data structures based on k-mers for querying large collections of sequencing data sets

C Marchet, C Boucher, SJ Puglisi, P Medvedev… - Genome …, 2021 - genome.cshlp.org
High-throughput sequencing data sets are usually deposited in public repositories (e.g., the
European Nucleotide Archive) to ensure reproducibility. As the amount of data has reached …

Bifrost: highly parallel construction and indexing of colored and compacted de Bruijn graphs

G Holley, P Melsted - Genome Biology, 2020 - Springer
Memory consumption of de Bruijn graphs is often prohibitive. Most de Bruijn graph-based
assemblers reduce the complexity by compacting paths into single vertices, but this is …

Ultrafast search of all deposited bacterial and viral genomic data

P Bradley, HC Den Bakker, EPC Rocha… - Nature …, 2019 - nature.com
Exponentially increasing amounts of unprocessed bacterial and viral genomic sequence
data are stored in the global archives. The ability to query these data for sequence search …

Themisto: a scalable colored k-mer index for sensitive pseudoalignment against hundreds of thousands of bacterial genomes

JN Alanko, J Vuohtoniemi, T Mäklin, SJ Puglisi - Bioinformatics, 2023 - academic.oup.com
Motivation: Huge datasets containing whole-genome sequences of bacterial strains are now
commonplace and represent a rich and important resource for modern genomic …

Metabolic framework of spontaneous and synthetic sourdough metacommunities to reveal microbial players responsible for resilience and performance

FM Calabrese, H Ameur, O Nikoloudaki, G Celano… - Microbiome, 2022 - Springer
Background: In nature, microbial communities undergo changes in composition that threaten
their resiliency. Here, we interrogated sourdough, a natural cereal-fermenting …

Current affairs of microbial genome-wide association studies: approaches, bottlenecks and analytical pitfalls

JE San, S Baichoo, A Kanzi, Y Moosa… - Frontiers in …, 2020 - frontiersin.org
Microbial genome-wide association studies (mGWAS) are a new and exciting research field
that is adapting human GWAS methods to understand how variations in microbial genomes …

Mantis: a fast, small, and exact large-scale sequence-search index

P Pandey, F Almodaresi, MA Bender, M Ferdman… - Cell Systems, 2018 - cell.com
Sequence-level searches on large collections of RNA sequencing experiments, such as the
NCBI Sequence Read Archive (SRA), would enable one to ask many questions about the …

Genome-wide somatic variant calling using localized colored de Bruijn graphs

G Narzisi, A Corvelo, K Arora, EA Bergmann… - Communications …, 2018 - nature.com
Reliable detection of somatic variations is of critical importance in cancer research. Here we
present Lancet, an accurate and sensitive somatic variant caller, which detects SNVs and …

COBS: a compact bit-sliced signature index

T Bingmann, P Bradley, F Gauger, Z Iqbal - String Processing and …, 2019 - Springer
We present COBS, a COmpact Bit-sliced Signature index, which is a cross-over between an
inverted index and Bloom filters. Our target application is to index k-mers of DNA samples or …