An extended de Bruijn graph for feature engineering over biological sequential data

MO Cakiroglu, H Kurban, P Sharma… - Machine Learning …, 2024 - singcf.iopscience.iop.org
In this study, we introduce a novel de Bruijn graph (dBG) based framework for feature
engineering in biological sequential data such as proteins. This framework simplifies feature …

Sequence-based data-constrained deep learning framework to predict spider dragline mechanical properties

A Pandey, W Chen, S Keten - Communications Materials, 2024 - nature.com
Spider dragline silk is known for its exceptional strength and toughness; hence
understanding the link between its primary sequence and mechanics is crucial. Here, we …

Mathematical programming in computational biology: an annotated bibliography

G Lancia - Algorithms, 2008 - mdpi.com
The field of computational biology has experienced a tremendous growth in the past 15
years. In this bibliography, we survey some of the most significant contributions that were …

LASAGNA: a novel algorithm for transcription factor binding site alignment

C Lee, CH Huang - BMC bioinformatics, 2013 - Springer
Background: Scientists routinely scan DNA sequences for transcription factor (TF)
binding sites (TFBSs). Most of the available tools rely on position-specific scoring matrices …

NestedMICA as an ab initio protein motif discovery tool

M Doğruel, TA Down, TJP Hubbard - BMC bioinformatics, 2008 - Springer
Background: Discovering overrepresented patterns in amino acid sequences is an important
step in protein functional element identification. We adapted and extended NestedMICA, an …

A closer look at the closest string and closest substring problem

M Chimani, M Woste, S Böcker - … Proceedings of the Thirteenth Workshop on …, 2011 - SIAM
Let S be a set of k strings over an alphabet Σ; each string has a length between ℓ and n. The
Closest Substring Problem (CSSP) is to find a minimal integer d (and a corresponding string …

DiversiTree: A New Method to Efficiently Compute Diverse Sets of Near-Optimal Solutions to Mixed-Integer Optimization Problems

I Ahanor, H Medal, AC Trapp - INFORMS Journal on …, 2024 - pubsonline.informs.org
Although most methods for solving mixed-integer optimization problems compute a single
optimal solution, a diverse set of near-optimal solutions can often lead to improved …

DNA motif finding method without protection can leak user privacy

X Wu, H Wang, M Shi, A Wang, K Xia - IEEE Access, 2019 - ieeexplore.ieee.org
DNA sequence analysis plays an important role in the study of gene regulatory networks.
DNA motif finding has become a key discipline in the post-gene era and gradually become a …

Binding site graphs: a new graph theoretical framework for prediction of transcription factor binding sites

TE Reddy, C DeLisi… - PLoS computational …, 2007 - journals.plos.org
Computational prediction of nucleotide binding specificity for transcription factors remains a
fundamental and largely unsolved problem. Determination of binding positions is a …

M are better than one: an ensemble-based motif finder and its application to regulatory element prediction

C Yanover, M Singh, E Zaslavsky - Bioinformatics, 2009 - academic.oup.com
Motivation: Identifying regulatory elements in genomic sequences is a key component in
understanding the control of gene expression. Computationally, this problem is often …