Big data in biology: the hope and present-day challenges in it

S Pal, S Mondal, G Das, S Khatua, Z Ghosh - Gene Reports, 2020 - Elsevier
The wave of new technologies has opened up the opportunity for cost-effective generation of
high-throughput profiles of biological systems. This is generating tons of biological data. It is …

High-throughput DNA sequence data compression

Z Zhu, Y Zhang, Z Ji, S He, X Yang - Briefings in bioinformatics, 2015 - academic.oup.com
The exponential growth of high-throughput DNA sequence data has posed great challenges
to genomic data storage, retrieval and transmission. Compression is a critical tool to address …

Efficient storage of high throughput DNA sequencing data using reference-based compression

MHY Fritz, R Leinonen, G Cochrane… - Genome research, 2011 - genome.cshlp.org
Data storage costs have become an appreciable proportion of total cost in the creation and
analysis of DNA sequence data. Of particular concern is that the rate of increase in DNA …

DNACompress: fast and effective DNA sequence compression

X Chen, M Li, B Ma, J Tromp - Bioinformatics, 2002 - Citeseer
While achieving the best compression ratios for DNA sequences, our new DNACompress
program significantly improves the running time of all previous DNA compression programs …

A simple statistical algorithm for biological sequence compression

MD Cao, TI Dix, L Allison… - 2007 Data Compression …, 2007 - ieeexplore.ieee.org
This paper introduces a novel algorithm for biological sequence compression that makes
use of both statistical properties and repetition within sequences. A panel of experts is …

A survey on data compression methods for biological sequences

M Hosseini, D Pratas, AJ Pinho - Information, 2016 - mdpi.com
The ever increasing growth of the production of high-throughput sequencing data poses a
serious challenge to the storage, processing and transmission of these data. As frequently …

GReEn: a tool for efficient compression of genome resequencing data

AJ Pinho, D Pratas, SP Garcia - Nucleic acids research, 2012 - academic.oup.com
Research in the genomic sciences is confronted with the volume of sequencing and
resequencing data increasing at a higher pace than that of data storage and communication …

DNA sequence compression using adaptive particle swarm optimization-based memetic algorithm

Z Zhu, J Zhou, Z Ji, YH Shi - IEEE transactions on evolutionary …, 2011 - ieeexplore.ieee.org
With the rapid development of high-throughput DNA sequencing technologies, the amount
of DNA sequence data is accumulating exponentially. The huge influx of data creates new …

Textual data compression in computational biology: a synopsis

R Giancarlo, D Scaturro, F Utro - Bioinformatics, 2009 - academic.oup.com
Motivation: Textual data compression, and the associated techniques coming from
information theory, are often perceived as being of interest for data communication and …

Efficient DNA sequence compression with neural networks

M Silva, D Pratas, AJ Pinho - GigaScience, 2020 - academic.oup.com
Background The increasing production of genomic data has led to an intensified need for
models that can cope efficiently with the lossless compression of DNA sequences. Important …