Fast alignment-free sequence comparison using spaced-word frequencies

CA Leimeister, M Boden, S Horwege, S Lindner… - …, 2014 - academic.oup.com
Motivation: Alignment-free methods for sequence comparison are increasingly used for
genome analysis and phylogeny reconstruction; they circumvent various difficulties of …

kmacs: the k-mismatch average common substring approach to alignment-free sequence comparison

CA Leimeister, B Morgenstern - Bioinformatics, 2014 - academic.oup.com
Motivation: Alignment-based methods for sequence analysis have various limitations if large
datasets are to be analysed. Therefore, alignment-free approaches have become popular in …

Spaced words and kmacs: fast alignment-free sequence comparison based on inexact word matches

S Horwege, S Lindner, M Boden, K Hatje… - Nucleic acids …, 2014 - academic.oup.com
In this article, we present a user-friendly web interface for two alignment-free sequence-
comparison methods that we recently developed. Most alignment-free methods rely on exact …

CMash: fast, multi-resolution estimation of k-mer-based Jaccard and containment indices

S Liu, D Koslicki - Bioinformatics, 2022 - academic.oup.com
Motivation: K-mer-based methods are used ubiquitously in the field of computational biology.
However, determining the optimal value of k for a specific application often remains …

An effective extension of the applicability of alignment-free biological sequence comparison algorithms with Hadoop

G Cattaneo, UF Petrillo, R Giancarlo… - The Journal of …, 2017 - Springer
Alignment-free methods are one of the mainstays of biological sequence comparison, i.e., the
assessment of how similar two biological sequences are to each other, a fundamental and …

Spanning tree based state encoding for low power dissipation

W Nöth, R Kolla - Proceedings of the conference on Design, automation …, 1999 - dl.acm.org
In this paper we address the problem of state encoding for synchronous finite state
machines. The primary goal is the reduction of switching activity in the state register. At the …

A Coverage Criterion for Spaced Seeds and Its Applications to Support Vector Machine String Kernels and k-Mer Distances

L Noé, DEK Martin - Journal of Computational Biology, 2014 - liebertpub.com
Spaced seeds have been recently shown to not only detect more alignments, but also to
give a more accurate measure of phylogenetic distances, and to provide a lower …

CStone: A de novo transcriptome assembler for short-read data that identifies non-chimeric contigs based on underlying graph structure

R Linheiro, J Archer - PLoS computational biology, 2021 - journals.plos.org
With the exponential growth of sequence information stored over the last decade, including
that of de novo assembled contigs from RNA-Seq experiments, quantification of chimeric …

Sweep: representing large biological sequences datasets in compact vectors

CR De Pierri, R Voyceik, LGC Santos de Mattos… - Scientific reports, 2020 - nature.com
Vectoral and alignment-free approaches to biological sequence representation have been
explored in bioinformatics to efficiently handle big data. Even so, most current methods …

Accurate multiple alignment of distantly related genome sequences using filtered spaced word matches as anchor points

CA Leimeister, T Dencker, B Morgenstern - Bioinformatics, 2019 - academic.oup.com
Motivation: Most methods for pairwise and multiple genome alignment use fast local
homology search tools to identify anchor points, i.e. high-scoring local alignments of the input …