Authors
Yu Peng, Henry CM Leung, Siu-Ming Yiu, Francis YL Chin
Publication date
2010
Conference
Research in Computational Molecular Biology: 14th Annual International Conference, RECOMB 2010, Lisbon, Portugal, April 25-28, 2010. Proceedings 14
Pages
426-440
Publisher
Springer Berlin Heidelberg
Description
The de Bruijn graph assembly approach breaks reads into k-mers before assembling them into contigs. The string graph approach forms contigs by connecting two reads with k or more overlapping nucleotides. Both approaches must deal with the following problems: false-positive vertices, due to erroneous reads; the gap problem, due to non-uniform coverage; and the branching problem, due to erroneous reads and repeat regions. A proper choice of k is crucial, but for a single k there is always a trade-off: a small k copes better with erroneous reads and non-uniform coverage, and a large k copes better with short repeat regions.
We propose an iterative de Bruijn graph approach that iterates from small to large k, exploring the advantages of the in-between values. Our IDBA outperforms the existing algorithms by constructing longer contigs with similar accuracy and using less memory, on both real and simulated data. The …
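The role of k described above can be illustrated with a small sketch. This is not the IDBA implementation: the function names (`kmers`, `build_de_bruijn`, `branching_vertices`) and the toy sequence are assumptions made for illustration only. The sketch builds a simple de Bruijn-style graph whose vertices are k-mers and shows one side of the trade-off: a repeat shorter than k no longer causes a branch.

```python
def kmers(read, k):
    """Return all length-k substrings of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_de_bruijn(reads, k):
    """Map each k-mer to the set of k-mers that directly follow it in some read."""
    graph = {}
    for read in reads:
        ks = kmers(read, k)
        # Register every k-mer as a vertex, even if it has no successor.
        for km in ks:
            graph.setdefault(km, set())
        # Consecutive k-mers in a read overlap by k-1 characters: add an edge.
        for a, b in zip(ks, ks[1:]):
            graph[a].add(b)
    return graph


def branching_vertices(graph):
    """Return the vertices with more than one outgoing edge, sorted."""
    return sorted(v for v, succ in graph.items() if len(succ) > 1)


# Hypothetical toy sequence: "AGCGT" occurs twice, followed by C and then by G.
toy_reads = ["TAGCGTCAGCGTGA"]

for k in (3, 5, 6):
    print(k, branching_vertices(build_de_bruijn(toy_reads, k)))
# 3 ['CGT']
# 5 ['AGCGT']
# 6 []
```

With k at most the repeat length (5 here), the repeated k-mer has two different successors and the graph branches; with k = 6 every k-mer is unique and the branch disappears. The sketch uses a single error-free sequence, so it does not show the opposite side of the trade-off (gaps and false vertices at large k under low coverage or read errors).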
Total citations
[Chart: citations per year, 2010 to 2024]
Scholar articles
Y Peng, HCM Leung, SM Yiu, FYL Chin - Research in Computational Molecular Biology: 14th …, 2010