Authors
Anirban Dutta, Mohammed Monzoorul Haque, Tungadri Bose, Ch V Siva K Reddy, Sharmila S Mande
Publication date
2015/6/19
Journal
Journal of Bioinformatics and Computational Biology
Volume
13
Issue
03
Pages
1541003
Publisher
Imperial College Press
Description
Sequence data repositories archive and disseminate fastq data in compressed format. In spite of having relatively lower compression efficiency, data repositories continue to prefer GZIP over available specialized fastq compression algorithms. Ease of deployment, high processing speed and portability are the reasons for this preference. This study presents FQC, a fastq compression method that, in addition to providing significantly higher compression gains over GZIP, incorporates features necessary for universal adoption by data repositories/end-users. This study also proposes a novel archival strategy which allows sequence repositories to simultaneously store and disseminate lossless as well as (multiple) lossy variants of fastq files, without necessitating any additional storage requirements. For academic users, Linux, Windows, and Mac implementations (both 32 and 64-bit) of FQC are freely available for …
Total citations
[Bar chart: citations per year, 2015 to 2024]
Scholar articles
A Dutta, MM Haque, T Bose, CVSK Reddy, SS Mande - Journal of Bioinformatics and Computational Biology, 2015