GSC: efficient lossless compression of VCF files with fast query

X Luo, Y Chen, L Liu, L Ding, Y Li, S Li, Y Zhang… - …, 2024 - academic.oup.com
Background: With the rise of large-scale genome sequencing projects, genotyping of
thousands of samples has produced immense variant call format (VCF) files. It is becoming …

Biostatistical Aspects of Whole Genome Sequencing Studies: Preprocessing and Quality Control

RO Betschart, C Riccio, D Aguilera‐Garcia… - Biometrical …, 2024 - Wiley Online Library
Rapid advances in high‐throughput DNA sequencing technologies have enabled large‐
scale whole genome sequencing (WGS) studies. Before performing association analysis …

GBC: a parallel toolkit based on highly addressable byte-encoding blocks for extremely large-scale genotypes of species

L Zhang, Y Yuan, W Peng, B Tang, MJ Li, H Gui… - Genome biology, 2023 - Springer
Whole-genome sequencing projects of millions of subjects contain enormous genotypes,
entailing a huge memory burden and time for computation. Here, we present GBC, a toolkit …

Analysis-ready VCF at Biobank scale using Zarr

EA Czech, TR Millar, TE White, B Jeffery, A Miles… - bioRxiv, 2024 - biorxiv.org
Background: Variant Call Format (VCF) is the standard file format for interchanging genetic
variation data and associated quality control metrics. The usual row-wise encoding of the …

On Next-Generation Sequencing Compression via Multi-GPU

P De Luca, A Di Mauro, S Fiscale - International Symposium on Intelligent …, 2021 - Springer
In the last decades, the human genome analysis for addressing health-care problems, has
widely grown. With the high throughput of biological data and, needing of represent them …

Genozip 14 - advances in compression of BAM and CRAM files

D Lan, B Llamas - bioRxiv, 2022 - biorxiv.org
Genozip performs compression of a wide range of genomic data, including widely used
FASTQ, BAM and VCF file formats. Here, we introduce the latest advancement in Genozip …

An Abnormal Gene Detection Method Based on Selene

Q Zhang, Y Jiang - Intelligent Computing Theories and Application: 17th …, 2021 - Springer
When screening abnormal genes (such as cancer genes), it is very difficult to
only rely on the experience of bioinformatics scientists, so we usually use deep learning …

Block Based Compression and Encryption of Genetic Information in VCF Files for Controlled Access

T Gautot, B Macq - dial.uclouvain.be
In recent years, the development of new technologies has brought down the cost of genome
sequencing by several orders of magnitude. This has ushered in a new era of genomic …