Data compression for sequencing data
S Deorowicz, S Grabowski - Algorithms for Molecular Biology, 2013 - Springer
Post-Sanger sequencing methods produce tons of data, and there is a general agreement
that the challenge to store and process them must be addressed with data compression. In …
High-throughput DNA sequence data compression
The exponential growth of high-throughput DNA sequence data has posed great challenges
to genomic data storage, retrieval and transmission. Compression is a critical tool to address …
Robust relative compression of genomes with random access
S Deorowicz, S Grabowski - Bioinformatics, 2011 - academic.oup.com
Motivation: Storing, transferring and maintaining genomic databases becomes a major
challenge because of the rapid technology progress in DNA sequencing and …
A survey on data compression methods for biological sequences
The ever increasing growth of the production of high-throughput sequencing data poses a
serious challenge to the storage, processing and transmission of these data. As frequently …
Memory-efficient assembly using Flye
In the past decade, next-generation sequencing (NGS) enabled the generation of genomic
data in a cost-effective, high-throughput manner. The most recent third-generation …
GReEn: a tool for efficient compression of genome resequencing data
Research in the genomic sciences is confronted with the volume of sequencing and
resequencing data increasing at a higher pace than that of data storage and communication …
Computing MEMs and Relatives on Repetitive Text Collections
G Navarro - arXiv preprint arXiv:2210.09914, 2022 - arxiv.org
We consider the problem of computing the Maximal Exact Matches (MEMs) of a given
pattern $P[1..m]$ on a large repetitive text collection $T[1..n]$, which is represented as a …
Efficient DNA sequence compression with neural networks
Background The increasing production of genomic data has led to an intensified need for
models that can cope efficiently with the lossless compression of DNA sequences. Important …
Iterative dictionary construction for compression of large DNA data sets
S Kuruppu, B Beresford-Smith… - … /ACM transactions on …, 2011 - ieeexplore.ieee.org
Genomic repositories increasingly include individual as well as reference sequences, which
tend to share long identical and near-identical strings of nucleotides. However, the …
FRESCO: Referential compression of highly similar sequences
In many applications, sets of similar texts or sequences are of high importance. Prominent
examples are revision histories of documents or genomic sequences. Modern high …