An extended de Bruijn graph for feature engineering over biological sequential data

MO Cakiroglu, H Kurban, P Sharma… - Machine Learning …, 2024 - iopscience.iop.org
In this study, we introduce a novel de Bruijn graph (dBG) based framework for feature
engineering in biological sequential data such as proteins. This framework simplifies feature …

Aligning and clustering patterns to reveal the protein functionality of sequences

AKC Wong, ESA Lee - IEEE/ACM Transactions on …, 2014 - ieeexplore.ieee.org
Discovering sequence patterns with variations unveils significant functions of a protein
family. Existing combinatorial methods of discovering patterns with variations are …

Ranking and compacting binding segments of protein families using aligned pattern clusters

ESA Lee, AKC Wong - Proteome science, 2013 - Springer
Background Discovering sequence patterns with variation can unveil functions of a protein
family that are important for drug discovery. Exploring protein families using existing …

A graph-theoretical approach for motif discovery in protein sequences

E Czeizler, T Hirvola, K Karhu - IEEE/ACM transactions on …, 2015 - ieeexplore.ieee.org
Motif recognition is a challenging problem in bioinformatics due to the diversity of protein
motifs. Many existing algorithms identify motifs of a given length, thus being either not …

Identifying protein binding functionality of protein family sequences by Aligned Pattern clusters

ESA Lee, AKC Wong - 2012 IEEE International Conference on …, 2012 - ieeexplore.ieee.org
A basic task in protein analysis is to discover a set of sequence patterns that reflect the
function of a protein family. This set of sequence patterns contains non-exact significant …

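Some indices of alphabet overlap graph

R Yang, ZL Yang, HP Zhang - Journal of Computer Science and Technology, 2012 - jcst.ict.ac.cn
The undirected de Bruijn graph has many good properties, such as a small diameter and a large
vertex degree, and is often used in the design of communication networks. In this paper, we consider the alphabet overlap graph G(k, d, s): its vertex set is V = {v | v = (v_1 ... v_k); v_i ∈ {1, 2, ..., d …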

Deriving safety properties of critical software from the system risk analysis, application to ground transportation systems

JL Boulanger, V Delebarre, S Natkin… - … 1997 High-Assurance …, 1997 - ieeexplore.ieee.org
Safety properties of critical software are consequences of the application safety properties
(i.e. the front collision of two trains must not occur), and of the system design choices. The …

A graph based approach to discover conserved regions in DNA and protein sequences

S Challa, P Thulasiraman - 21st International Conference on …, 2007 - ieeexplore.ieee.org
This paper attempts to provide a graph based approach to discover conserved regions such
as motifs in either DNA or protein sequences. The motif discovery problem has gained a lot of …

Synthesizing aligned random pattern digraphs from protein sequence patterns

AES Lee, AKC Wong - 2011 IEEE International Conference on …, 2011 - ieeexplore.ieee.org
An essential step of protein function analysis is to discover patterns that represent functional
regions in a set of protein family sequences. However, the same functional region of a …

Some indices of alphabet overlap graph

R Yang, ZL Yang, HP Zhang - Journal of Computer Science and …, 2012 - Springer
The undirected de Bruijn graph is often used as the model of communication network for its
useful properties, such as short diameter, small maximum vertex degree. In this paper, we …