Achieving 80% ten‐fold cross‐validated accuracy for secondary structure prediction by large‐scale training

O Dor, Y Zhou - Proteins: Structure, Function, and …, 2007 - Wiley Online Library
Proteins: Structure, Function, and Bioinformatics, 2007
Abstract
An integrated system of neural networks, called SPINE, is established and optimized for predicting structural properties of proteins. SPINE is applied to three‐state secondary‐structure and residue‐solvent‐accessibility (RSA) prediction in this paper. The integrated neural networks are carefully trained with a large dataset of 2640 chains, sequence profiles generated from multiple sequence alignment, representative amino acid properties, a slow learning rate, overfitting protection, and an optimized sliding‐window size. More than 200,000 weights in SPINE are optimized by maximizing the accuracy measured by Q3 (the percentage of correctly classified residues). SPINE yields a 10‐fold cross‐validated accuracy of 79.5% (80.0% for chains of length between 50 and 300) in secondary‐structure prediction after one‐month (CPU time) training on 22 processors. An accuracy of 87.5% is achieved for exposed residues (RSA >95%). The latter approaches the theoretical upper limit of 88–90% accuracy in assigning secondary structures. An accuracy of 73% for three‐state solvent‐accessibility prediction (25%/75% cutoff) and 79.3% for two‐state prediction (25% cutoff) is also obtained. Proteins 2007. © 2006 Wiley‐Liss, Inc.
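The Q3 measure named in the abstract, the percentage of correctly classified residues, can be illustrated with a minimal Python sketch. This is not code from SPINE: the function name `q3`, the one-letter state strings (H for helix, E for strand, C for coil), and the example sequences are assumptions made purely for illustration.

```python
def q3(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose predicted state matches
    the observed state.

    Both arguments are hypothetical per-residue state strings, one
    character per residue (for example H, E, or C).
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    # Count positions where the predicted state equals the observed state.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # Made-up ten-residue example; the strings differ at one position.
    print(q3("HHHHEEECCC", "HHHHEEEECC"))
```

A per-chain score such as this would typically be aggregated over all residues in a test fold; how the paper pools results across the ten folds is not specified in the abstract.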