Test development, optimization and validation of a WGS pipeline for genetic disorders

Z Yang, X Yang, Y Sun, Y Wang, L Song, Z Qiao… - … Medical Genomics, 2023 - Springer
sequencing (MPS) technology, whole-genome sequencing (… , optimization, and validation
of the WGS pipeline for genetic … A set of best practices were recommended using our pipeline

Genomics pipelines to investigate susceptibility in whole genome and exome sequenced data for variant discovery, annotation, prediction and genotyping

Z Ahmed, EG Renart, S Zeeshan - PeerJ, 2021 - peerj.com
DNA sequencing methodologies today are the whole-genome sequencing (WGS) and
whole-exome sequencing … the variant calling, by using the MarkDuplicate command-line tool from …

Optimization of enzymatic fragmentation is crucial to maximize genome coverage: a comparison of library preparation methods for Illumina sequencing

T Ribarska, PM Bjørnstad, AYM Sundaram, GD Gilfillan - BMC genomics, 2022 - Springer
sequencing (NGS), offering cost-effective high precision data for a wide variety of applications
such as whole genome sequencing (… ), we configured MarkDuplicates to use a distance …

OVarFlow: a resource optimized GATK 4 based Open source Variant calling workFlow

J Bathke, G Lühken - BMC bioinformatics, 2021 - Springer
… and indels) in whole genome sequencing data that can be … Furthermore, we are introducing
additional optimizations on … of too many GC threads for MarkDuplicates is consistent with …

Genomic diversity affects the accuracy of bacterial single-nucleotide polymorphism–calling pipelines

SJ Bush, D Foster, DW Eyre, EL Clark, N De Maio… - …, 2020 - academic.oup.com
… Finally, so as to approximate “out of the box” use conditions, we made a minimal-effort
application of each program with no attempt at species-specific optimization. Had we optimized …

Optimizing performance of GATK workflows using Apache Arrow In-Memory data framework

T Ahmad, N Ahmed, Z Al-Ars, HP Hofstee - BMC genomics, 2020 - Springer
genome sequencing (WGS) and whole exome sequencing (… this application similar to the
MarkDuplicate application. All … intensive parts of genomics pipelines, but availability of such …

Performance evaluation of pipelines for mapping, variant calling and interval padding, for the analysis of NGS germline panels

M Zanti, K Michailidou, MA Loizidou, C Machattou… - BMC …, 2021 - Springer
sequencing data [37]. It is constantly evolving with a diversity of performance optimization
Duplicates were detected and removed using the Picard MarkDuplicates tool and read …

A comparative investigation of single nucleotide variant calling for a personal non-Caucasian sequencing sample

HS Park, JS Gim - Genes & Genomics, 2023 - Springer
… increasing clinical application of whole genome sequencing (WGS) lead … pipelines with and
without the markduplicate step. Precision and recall values with or without the markduplicate

Arrowsam: In-memory genomics data processing using apache arrow

T Ahmad, N Ahmed, J Peltenburg… - 2020 3rd International …, 2020 - ieeexplore.ieee.org
… This section provides a short description of DNA sequence … stack that can be hard to optimize
for performance. However, … O access of DNA sequencing data files) and MarkDuplicate to …

Accelerating next generation sequencing data analysis: an evaluation of optimized best practices for Genome Analysis Toolkit algorithms

KR Franke, EL Crowgey - Genomics & informatics, 2020 - ncbi.nlm.nih.gov
… Parabricks was able to process a 50× whole-genome sequencing library in under 3 h and …
pipelines reduced runtimes even further to under 8 h with much faster BWA, MarkDuplicates, …