Authors
Marie Skovgaard, Lars Juhl Jensen, Søren Brunak, David Ussery, Anders Krogh
Publication date
2001/8/1
Journal
TRENDS in Genetics
Volume
17
Issue
8
Pages
425-428
Publisher
Elsevier
Description
In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length distribution of the annotated genes with the length distribution of those matching a known protein reveals that too many short genes are annotated in many genomes. Here we estimate the true number of protein-coding genes for sequenced genomes. Although it is often claimed that Escherichia coli has about 4300 genes, we show that it probably has only ∼3800 genes, and that a similar discrepancy exists for almost all published genomes.
Total citations
[Citations-per-year bar chart, 2000 to 2024; individual yearly counts are not recoverable from the extracted text]