HLAB: learning the BiLSTM features from the ProtBert-encoded proteins for the class I HLA-peptide binding prediction

Y Zhang, G Zhu, K Li, F Li, L Huang, M Duan, F Zhou
Briefings in Bioinformatics, 2022 (academic.oup.com)
Abstract
Human Leukocyte Antigen (HLA) is a type of molecule that resides on the surface of most human cells and plays an essential role in the immune response to foreign invaders. T cell antigen receptors may recognize HLA-peptide complexes on the surface of cancer cells, and cytotoxic T lymphocytes can then destroy these cells. The computational identification of HLA-binding peptides will facilitate the rapid development of cancer immunotherapies. This study hypothesized that peptide features encoded by a natural language processing model may be further enriched by another deep neural network. The hypothesis was tested by using a Bi-directional Long Short-Term Memory (BiLSTM) network to extract features from the encodings that a pretrained Protein Bidirectional Encoder Representations from Transformers (ProtBert) model produces for class I HLA (HLA-I)-binding peptides. The experimental data showed that our proposed HLAB feature engineering algorithm outperformed existing ones in detecting HLA-I-binding peptides. In an extensive evaluation, HLAB outperforms all seven existing studies in AUC on predicting the peptides binding to the HLA-A*01:01 allele, and achieves the best average AUC on six of the seven k-mer prediction tasks (k = 8, 9, ..., 14, where k is the number of amino acids in the peptide), the exception being the 9-mer task. The source code and the fine-tuned feature extraction models are available at http://www.healthinformaticslab.org/supp/resources.php.
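The per-length AUC comparison mentioned in the abstract can be illustrated with a small sketch. This is not the HLAB evaluation code: the function names `auc` and `auc_by_length`, the record layout, and the toy peptides, labels, and scores are all assumptions made up for illustration. It only shows how predictions might be grouped by peptide length k (8 to 14) and scored with AUC per group.

```python
from collections import defaultdict


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive is scored above a
    random negative; tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_length(records, k_min=8, k_max=14):
    """Group (peptide, label, score) records by peptide length and
    return {k: AUC} for every length k in [k_min, k_max] that occurs."""
    groups = defaultdict(lambda: ([], []))
    for peptide, label, score in records:
        k = len(peptide)
        if k_min <= k <= k_max:
            groups[k][0].append(label)
            groups[k][1].append(score)
    return {k: auc(ys, ss) for k, (ys, ss) in sorted(groups.items())}


# Hypothetical toy data: (peptide, binder label, predicted score).
records = [
    ("ACDEFGHI", 1, 0.9),
    ("KLMNPQRS", 0, 0.2),
    ("TVWYACDE", 1, 0.6),
    ("FGHIKLMN", 0, 0.7),
    ("ACDEFGHIK", 1, 0.8),
    ("LMNPQRSTV", 0, 0.3),
]

result = auc_by_length(records)
print(result)  # {8: 0.75, 9: 1.0}
```

The pairwise formulation is quadratic in group size, which is fine for a demonstration; a rank-based computation would be the usual choice for large benchmark sets.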
Oxford University Press