Alignment-free sequence analysis and applications
Genome and metagenome comparisons based on large amounts of next-generation
sequencing (NGS) data pose significant challenges for alignment-based approaches due to …
Genome-powered classification of microbial eukaryotes: focus on coral algal symbionts
Modern microbial taxonomy generally relies on the use of single marker genes or sets of
concatenated genes to generate a framework for the delineation and classification of …
A network-based integrated framework for predicting virus–prokaryote interactions
Metagenomic sequencing has greatly enhanced the discovery of viral genomic sequences;
however, it remains challenging to identify the host(s) of these new viruses. We developed …
Variable number tandem repeats mediate the expression of proximal genes
M Bakhtiari, J Park, YC Ding, S Shleizer-Burko… - Nature …, 2021 - nature.com
Variable number tandem repeats (VNTRs) account for significant genetic variation in many
organisms. In humans, VNTRs have been implicated in both Mendelian and complex …
Predicting host taxonomic information from viral genomes: A comparison of feature representations
F Young, S Rogers, DL Robertson - PLoS computational biology, 2020 - journals.plos.org
The rise in metagenomics has led to an exponential growth in virus discovery. However, the
majority of these new virus sequences have no assigned host. Current machine learning …
Lepidoptera genomes: current knowledge, gaps and future directions
DA Triant, SD Cinel, AY Kawahara - Current opinion in insect science, 2018 - Elsevier
Highlights: • Despite being an ecologically diverse and speciose insect order, genomes are
available for <10 of the 43 Lepidoptera superfamilies. • Genome-scale data are advancing …
The number of k-mer matches between two DNA sequences as a function of k and applications to estimate phylogenetic distances
S Röhling, A Linne, J Schellhorn, M Hosseini… - PLoS ONE, 2020 - journals.plos.org
We study the number N_k of length-k word matches between pairs of evolutionarily related
DNA sequences, as a function of k. We show that the Jukes-Cantor distance between two …
Synonymous nucleotide changes drive papillomavirus evolution
KM King, EV Rajadhyaksha, IG Tobey… - Tumour Virus …, 2022 - Elsevier
Papillomaviruses have been evolving alongside their hosts for at least 450 million years.
This review will discuss some of the insights gained into the evolution of this diverse family …
Enhancing metagenomic classification with compression-based features
JM Silva, JR Almeida - Artificial Intelligence in Medicine, 2024 - Elsevier
Metagenomics is a rapidly expanding field that uses next-generation sequencing technology
to analyze the genetic makeup of environmental samples. However, accurately identifying …
CMash: fast, multi-resolution estimation of k-mer-based Jaccard and containment indices
S Liu, D Koslicki - Bioinformatics, 2022 - academic.oup.com
Motivation: K-mer-based methods are used ubiquitously in the field of computational biology.
However, determining the optimal value of k for a specific application often remains …