From proteins to ligands: decoding deep learning methods for binding affinity prediction

R Gorantla, A Kubincova, AY Weiße… - Journal of Chemical …, 2023 - ACS Publications
Journal of Chemical Information and Modeling, 2023, ACS Publications
Accurate in silico prediction of protein–ligand binding affinity is important in the early stages of drug discovery. Deep learning-based methods exist but have yet to overtake more conventional methods such as giga-docking, largely due to their lack of generalizability. To improve generalizability, we need to understand what these models learn from input protein and ligand data. We systematically investigated a sequence-based deep learning framework to assess the impact of protein and ligand encodings on predicting binding affinities for commonly used kinase data sets. The role of proteins is studied using convolutional neural network-based encodings obtained from sequences and graph neural network-based encodings enriched with structural information from contact maps. Ligand-based encodings are generated from graph neural networks. We test different ligand perturbations by randomizing node and edge properties. For proteins, we make use of three different protein contact generation methods (AlphaFold2, Pconsc4, and ESM-1b) and compare these with a random control. Our investigation shows that protein encodings do not substantially impact the binding predictions, with no statistically significant difference in binding affinity for KIBA in the investigated metrics (concordance index, Pearson’s R, Spearman’s rank correlation, and RMSE). Significant differences are seen for ligand encodings with random ligands and random ligand node properties, suggesting a much bigger reliance on ligand data for the learning tasks. Using different ways to combine protein and ligand encodings did not show a significant change in performance.
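For readers unfamiliar with the four metrics named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the function names and the toy affinity values are hypothetical, and the concordance index uses the common pairwise definition in which tied predictions count as half-concordant.

```python
import math
from itertools import combinations


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true values are skipped; tied predictions count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # not a comparable pair
        comparable += 1
        if p_i == p_j:
            concordant += 0.5
        elif (t_i > t_j) == (p_i > p_j):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def rmse(y_true, y_pred):
    """Root-mean-square error between true and predicted values."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def _average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson_r(_average_ranks(x), _average_ranks(y))


# Toy values for illustration only (hypothetical, not from the paper).
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.1, 6.2, 6.1, 8.3]
scores = {
    "CI": concordance_index(measured, predicted),
    "Pearson R": pearson_r(measured, predicted),
    "Spearman": spearman_rho(measured, predicted),
    "RMSE": rmse(measured, predicted),
}
```

The concordance index and Spearman's rank correlation depend only on the ordering of the predictions, whereas RMSE penalizes absolute error, which is one reason such studies report them side by side.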