Authors
Korbinian Schneeberger, Stephan Ossowski, Felix Ott, Juliane D Klein, Xi Wang, Christa Lanz, Lisa M Smith, Jun Cao, Joffrey Fitz, Norman Warthmann, Stefan R Henz, Daniel H Huson, Detlef Weigel
Publication date
2011/6/21
Journal
Proceedings of the National Academy of Sciences
Volume
108
Issue
25
Pages
10249-10254
Publisher
National Academy of Sciences
Abstract
We present whole-genome assemblies of four divergent Arabidopsis thaliana strains that complement the 125-Mb reference genome sequence released a decade ago. Using a newly developed reference-guided approach, we assembled large contigs from 9 to 42 Gb of Illumina short-read data from the Landsberg erecta (Ler-1), C24, Bur-0, and Kro-0 strains, which have been sequenced as part of the 1,001 Genomes Project for this species. Using alignments against the reference sequence, we first reduced the complexity of the de novo assembly and later integrated reads without similarity to the reference sequence. As an example, half of the noncentromeric C24 genome was covered by scaffolds that are longer than 260 kb, with a maximum of 2.2 Mb. Moreover, over 96% of the reference genome was covered by the reference-guided assembly, compared with only 87% with a complete de novo assembly …