Counting kmers for biological sequences at large scale

J Ge, J Meng, N Guo, Y Wei, P Balaji… - Interdisciplinary Sciences …, 2020 - Springer
Counting the abundance of all the distinct kmers in biological sequence data is a
fundamental step in bioinformatics. These applications include de novo genome assembly …

nGIA: A novel greedy incremental alignment based algorithm for gene sequence clustering

Z Ju, H Zhang, J Meng, J Zhang, J Fan, Y Pan… - Future Generation …, 2022 - Elsevier
Gene sequence clustering is very basic and important in computational biology and
bioinformatics for the study of phylogenetic relationships and gene function prediction, etc …

Pakman: a scalable algorithm for generating genomic contigs on distributed memory machines

P Ghosh, S Krishnamoorthy… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
De novo genome assembly is a fundamental problem in the field of bioinformatics, that aims
to assemble the DNA sequence of an unknown genome from numerous short DNA …

Performance characterization of de novo genome assembly on leading parallel systems

M Ellis, E Georganas, R Egan, S Hofmeyr… - Euro-Par 2017: Parallel …, 2017 - Springer
De novo genome assembly is one of the most important and challenging computational
problems in modern genomics; further, it shares algorithms and communication patterns …

Parallelizing big de Bruijn graph construction on heterogeneous processors

S Qiu, Q Luo - 2017 IEEE 37th International Conference on …, 2017 - ieeexplore.ieee.org
De Bruijn graph construction is the first step in de novo assemblers to connect input reads
into a complete sequence without a reference genome. This step is both time and memory …

Pakman: Scalable assembly of large genomes on distributed memory machines

P Ghosh, S Krishnamoorthy… - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
De novo genome assembly is a fundamental problem in the field of bioinformatics, that aims
to assemble the DNA sequence of an unknown genome from numerous short DNA …

Sora: Scalable overlap-graph reduction algorithms for genome assembly using Apache Spark in the cloud

AJ Paul, D Lawrence, M Song, SH Lim… - 2018 IEEE …, 2018 - ieeexplore.ieee.org
The advent of high-throughput DNA sequencing techniques has permitted very high quality
de novo assemblies of genomes, but raise an issue of requiring large amounts of computer …

Novel approaches for the exploitation of high throughput sequencing data

A Limasset - 2017 - hal.science
In this thesis we discuss computational methods to deal with DNA sequences provided by
high throughput sequencers. We will mostly focus on the reconstruction of genomes from …

Using Apache Spark on genome assembly for scalable overlap-graph reduction

AJ Paul, D Lawrence, M Song, SH Lim, C Pan, TH Ahn - Human genomics, 2019 - Springer
Background: De novo genome assembly is a technique that builds the genome of a
specimen using overlaps of genomic fragments without additional work with reference …

[BOOK][B] Parallelizing de novo Assembly with Heterogeneous Processors

S Qiu - 2019 - search.proquest.com
De novo assemblers construct genome sequences from small fragments, without using any
reference genome. Specifically, they represent the fragments in a De Bruijn graph and …