Parallel computing for genome sequence processing
The rapid increase of genome data brought by gene sequencing technologies poses a
massive challenge to data processing. To solve the problems caused by enormous data and …
Searching for repetitions in biological networks: methods, resources and tools
We present here a compact overview of the data, models and methods proposed for the
analysis of biological networks based on the search for significant repetitions. In particular …
SIESTA: A scalable infrastructure of sequential pattern analysis
I Mavroudopoulos, A Gounaris - IEEE Transactions on Big Data, 2022 - ieeexplore.ieee.org
Sequential pattern analysis has become a mature topic with a lot of techniques for a variety
of sequential pattern mining-related problems. Moreover, tailored solutions for specific …
Pangenome comparison via ED strings
E Gabory, MN Mwaniki, N Pisanti, SP Pissis… - Frontiers in …, 2024 - frontiersin.org
An elastic-degenerate (ED) string is a sequence of sets of strings. It can also be
seen as a directed acyclic graph whose edges are labeled by strings. The notion of ED …
Sequence detection in event log files
Sequential pattern analysis has become a mature topic, with a lot of techniques for a variety
of sequential pattern mining-related problems. Moreover, tailored solutions for specific …
Parallel motif extraction from very long sequences
Motifs are frequent patterns used to identify biological functionality in genomic sequences,
periodicity in time series, or user trends in web logs. In contrast to a lot of existing work that …
ACME: A scalable parallel system for extracting frequent patterns from a very long sequence
Modern applications, including bioinformatics, time series, and web log analysis, require the
extraction of frequent patterns, called motifs, from one very long (i.e., several gigabytes) …
Searching for compact hierarchical structures in DNA by means of the Smallest Grammar Problem
M Gallé - 2011 - theses.hal.science
Motivated by the goal of discovering hierarchical structures inside DNA sequences, we
address the Smallest Grammar Problem, the problem of finding a smallest context-free …
Pattern masking for dictionary matching: theory and practice
Data masking is a common technique for sanitizing sensitive data maintained in database
systems which is becoming increasingly important in various application areas, such as in …
Irredundant tandem motifs
Eliminating the possible redundancy from a set of candidate motifs occurring in an input
string is fundamental in many applications. The existing techniques proposed to extract …