Authors
Rayan Chikhi, Paul Medvedev
Publication date
2014/1/1
Journal
Bioinformatics
Volume
30
Issue
1
Pages
31-37
Publisher
Oxford University Press
Description
Motivation: Genome assembly tools based on the de Bruijn graph framework rely on a parameter k, which represents a trade-off between several competing effects that are difficult to quantify. There is currently a lack of tools that would automatically estimate the best k to use and/or quickly generate histograms of k-mer abundances that would allow the user to make an informed decision.
Results: We develop a fast and accurate sampling method that constructs approximate abundance histograms with several orders of magnitude performance improvement over traditional methods. We then present a fast heuristic that uses the generated abundance histograms for putative k values to estimate the best possible value of k. We test the effectiveness of our tool using diverse sequencing datasets and find that its choice of k leads to some of the best assemblies.
Availability: Our tool KmerG …
Total citations
[Chart: citations per year, 2013 to 2024]