Sequence Alignment/Map format: a comprehensive review of approaches and applications

Y Liu, X Shen, Y Gong, Y Liu, B Song… - Briefings in …, 2023 - academic.oup.com
Abstract The Sequence Alignment/Map (SAM) format file is the text file used to record
alignment information. Alignment is the core of sequencing analysis, and downstream tasks …

Data compression for sequencing data

S Deorowicz, S Grabowski - Algorithms for Molecular Biology, 2013 - Springer
Post-Sanger sequencing methods produce tons of data, and there is a general agreement
that the challenge to store and process them must be addressed with data compression. In …

Compression of FASTQ and SAM format sequencing data

JK Bonfield, MV Mahoney - PloS one, 2013 - journals.plos.org
Storage and transmission of the data produced by modern DNA sequencing instruments has
become a major concern, which prompted the Pistoia Alliance to pose the …

Adam: Genomics formats and processing patterns for cloud scale computing

M Massie, F Nothaft, C Hartl, C Kozanitis… - … Technical Report, No …, 2013 - eecs.berkeley.edu
Current genomics data formats and processing pipelines are not designed to scale well to
large datasets. The current Sequence/Binary Alignment/Map (SAM/BAM) formats were …

A survey on data compression methods for biological sequences

M Hosseini, D Pratas, AJ Pinho - Information, 2016 - mdpi.com
The ever increasing growth of the production of high-throughput sequencing data poses a
serious challenge to the storage, processing and transmission of these data. As frequently …

MFCompress: a compression tool for FASTA and multi-FASTA data

AJ Pinho, D Pratas - Bioinformatics, 2014 - academic.oup.com
Motivation: The data deluge phenomenon is becoming a serious problem in most genomic
centers. To alleviate it, general purpose tools, such as gzip, are used to compress the data …

Compressing and reconstructing the voltage data for lithium-ion batteries using model migration and un-equidistant sampling techniques

X Tang, F Gao, X Lai - ETransportation, 2022 - Elsevier
The long-term storage of the batteries' operating data is critical to tracing and analysing their
historical use but challenged by the trillions of bytes of raw data generated per day. For …

High-throughput DNA sequence data compression

Z Zhu, Y Zhang, Z Ji, S He, X Yang - Briefings in bioinformatics, 2015 - academic.oup.com
The exponential growth of high-throughput DNA sequence data has posed great challenges
to genomic data storage, retrieval and transmission. Compression is a critical tool to address …

Genomic data compression

M Hernaez, D Pavlichin, T Weissman… - Annual Review of …, 2019 - annualreviews.org
Recently, there has been growing interest in genome sequencing, driven by advances in
sequencing technology, in terms of both efficiency and affordability. These developments …

Genome compression: a novel approach for large collections

S Deorowicz, A Danek, S Grabowski - Bioinformatics, 2013 - academic.oup.com
Motivation: Genomic repositories are rapidly growing, as witnessed by the 1000 Genomes or
the UK10K projects. Hence, compression of multiple genomes of the same species has …