Deep learning enables high-quality and high-throughput prediction of enzyme commission numbers

JY Ryu, HU Kim, SY Lee - Proceedings of the National …, 2019 - National Acad Sciences
Proceedings of the National Academy of Sciences, 2019
High-quality and high-throughput prediction of enzyme commission (EC) numbers is essential for accurate understanding of enzyme functions, which have many implications in pathologies and industrial biotechnology. Several EC number prediction tools are currently available, but their prediction performance needs to be further improved to precisely and efficiently process an ever-increasing volume of protein sequence data. Here, we report DeepEC, a deep learning-based computational framework that predicts EC numbers for protein sequences with high precision and in a high-throughput manner. DeepEC takes a protein sequence as input and predicts EC numbers as output. DeepEC uses 3 convolutional neural networks (CNNs) as a major engine for the prediction of EC numbers, and also implements homology analysis for EC numbers that cannot be classified by the CNNs. Comparative analyses against 5 representative EC number prediction tools show that DeepEC allows the most precise prediction of EC numbers, and is the fastest and the lightest in terms of the disk space required. Furthermore, DeepEC is the most sensitive in detecting the effects of mutated domains/binding site residues of protein sequences. DeepEC can be used as an independent tool, and also as a third-party software component in combination with other computational platforms that examine metabolic reactions.
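The two-stage flow described above (a CNN-based prediction first, with homology analysis for sequences the CNNs cannot classify) can be illustrated with a minimal sketch. This is not DeepEC's code or API: `predict_ec`, `toy_cnn`, and `toy_homology` are hypothetical names, and the toy predictors and their sequence-to-EC mappings are placeholders chosen only to make the example runnable.

```python
from typing import Callable, Optional

# Hypothetical type: a predictor maps a protein sequence to an EC number,
# or returns None when it cannot classify the sequence.
Predictor = Callable[[str], Optional[str]]


def predict_ec(
    sequence: str,
    cnn_predict: Predictor,
    homology_predict: Predictor,
) -> Optional[str]:
    """Try the CNN-based predictor first; fall back to homology analysis."""
    ec_number = cnn_predict(sequence)
    if ec_number is not None:
        return ec_number
    # The CNN stage could not classify the sequence, so use the fallback.
    return homology_predict(sequence)


def toy_cnn(sequence: str) -> Optional[str]:
    """Placeholder for the CNN stage (assumption: not a real model)."""
    return "1.1.1.1" if sequence.startswith("MK") else None


def toy_homology(sequence: str) -> Optional[str]:
    """Placeholder for the homology stage: an exact-match lookup table."""
    known = {"MSTA": "2.7.1.1"}  # illustrative mapping only
    return known.get(sequence)
```

Calling `predict_ec("MSTA", toy_cnn, toy_homology)` takes the fallback path, because the toy CNN stage returns `None` for that input. Passing the two stages as callables keeps the dispatch logic independent of how either stage is implemented.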