iDNA-ABT: advanced deep learning model for detecting DNA methylation with adaptive features and transductive information maximization

Y Yu, W He, J Jin, G Xiao, L Cui, R Zeng, L Wei - Bioinformatics, 2021 - academic.oup.com
Motivation
DNA methylation plays an important role in epigenetic modification and in the occurrence and development of diseases. Identifying DNA methylation sites is therefore critical for understanding their functional mechanisms. To date, several machine learning and deep learning methods have been developed to predict different DNA methylation types. However, they still rely heavily on handcrafted features, which can limit the extraction of high-level latent information. Moreover, most of them are designed for one specific DNA methylation type and therefore cannot predict multiple methylation types across multiple species simultaneously. In this study, we propose iDNA-ABT, an advanced deep learning model that combines adaptive embedding based on Bidirectional Encoder Representations from Transformers (BERT) with transductive information maximization (TIM).
Results
Benchmark results show that iDNA-ABT can automatically and adaptively learn the distinguishing features of biological sequences from multiple species, and thus performs significantly better than state-of-the-art methods in predicting three different DNA methylation types. In addition, a comparison experiment shows that the TIM loss is effective in binary classification tasks. Furthermore, by comparing the adaptive embedding with six handcrafted feature encodings, we verify that our features are adaptable and robust across species. Importantly, our model generalizes well across species, demonstrating that it can adaptively capture cross-species differences and improve predictive performance. For convenient use of our method, we also provide an online web server that implements iDNA-ABT.
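To make the idea behind a TIM-style objective concrete, the following is a minimal sketch in plain Python. It is not taken from the iDNA-ABT source code. It assumes the general form of transductive information maximization, in which a cross-entropy term on labeled samples is combined with a mutual-information term on unlabeled samples (marginal entropy minus weighted conditional entropy). The function names and the weights `alpha` and `lam` are hypothetical and chosen only for illustration.

```python
import math

EPS = 1e-12  # numerical guard for log(0)

def softmax(logits):
    """Convert a list of logits into class probabilities."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p + EPS) for p in probs)

def tim_style_loss(labeled_logits, labels, unlabeled_logits, alpha=1.0, lam=1.0):
    """Hypothetical TIM-style loss: lam * CE - (H(Y) - alpha * H(Y|X)).

    labeled_logits / labels: samples with known classes (cross-entropy term).
    unlabeled_logits: samples whose predictions feed the mutual-information term.
    """
    # Mean cross-entropy over the labeled samples.
    ce = 0.0
    for logits, y in zip(labeled_logits, labels):
        ce -= math.log(softmax(logits)[y] + EPS)
    ce /= len(labels)

    probs = [softmax(l) for l in unlabeled_logits]
    n = len(probs)
    n_classes = len(probs[0])

    # Conditional entropy: low when each prediction is confident.
    cond_ent = sum(entropy(p) for p in probs) / n
    # Marginal entropy: high when predictions are spread across classes.
    marginal = [sum(p[k] for p in probs) / n for k in range(n_classes)]
    marg_ent = entropy(marginal)

    mutual_info = marg_ent - alpha * cond_ent
    return lam * ce - mutual_info

# Toy binary (methylated vs. non-methylated) example with made-up logits.
labeled = [[5.0, -5.0], [-5.0, 5.0]]
labels = [0, 1]
confident_balanced = [[4.0, -4.0], [-4.0, 4.0]]
uncertain = [[0.0, 0.0], [0.0, 0.0]]

loss_confident = tim_style_loss(labeled, labels, confident_balanced)
loss_uncertain = tim_style_loss(labeled, labels, uncertain)
print(loss_confident < loss_uncertain)
```

In this sketch the loss is lower when predictions on the unlabeled samples are individually confident yet balanced across the two classes, which is the behavior the mutual-information term is meant to encourage.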
Availability and implementation
iDNA-ABT and the data are freely accessible at http://server.wei-group.net/iDNA_ABT, and the source code is available for download from the GitHub repository (https://github.com/YUYING07/iDNA_ABT).
Supplementary information
Supplementary data are available at Bioinformatics online.
Oxford University Press