Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

Machine learning for functional protein design

P Notin, N Rollins, Y Gal, C Sander, D Marks - Nature biotechnology, 2024 - nature.com
Recent breakthroughs in AI coupled with the rapid accumulation of protein sequence and
structure data have radically transformed computational protein design. New methods …

Convolutions are competitive with transformers for protein sequence pretraining

KK Yang, N Fusi, AX Lu - Cell Systems, 2024 - cell.com
Pretrained protein sequence language models have been shown to improve the
performance of many prediction tasks and are now routinely integrated into bioinformatics …

Scientific large language models: A survey on biological & chemical domains

Q Zhang, K Ding, T Lyv, X Wang, Q Yin… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have emerged as a transformative power in enhancing
natural language comprehension, representing a significant stride toward artificial general …

Multimodal large language models in health care: Applications, challenges, and future outlook

R AlSaad, A Abd-Alrazaq, S Boughorbel… - Journal of medical …, 2024 - jmir.org
In the complex and multidimensional field of medicine, multimodal data are prevalent and
crucial for informed clinical decisions. Multimodal data span a broad spectrum of data types …

Prot2Text: Multimodal protein's function generation with GNNs and transformers

H Abdine, M Chatzianastasis, C Bouyioukos… - Proceedings of the …, 2024 - ojs.aaai.org
In recent years, significant progress has been made in the field of protein function prediction
with the development of various machine-learning approaches. However, most existing …

BioT5: Enriching cross-modal integration in biology with chemical knowledge and natural language associations

Q Pei, W Zhang, J Zhu, K Wu, K Gao, L Wu… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advancements in biological research leverage the integration of molecules, proteins,
and natural language to enhance drug discovery. However, current models exhibit several …

ProLLaMA: A protein large language model for multi-task protein language processing

L Lv, Z Lin, H Li, Y Liu, J Cui, C Yu-Chian Chen… - arXiv e …, 2024 - ui.adsabs.harvard.edu
Large Language Models (LLMs), including GPT-x and LLaMA2, have achieved
remarkable performance in multiple Natural Language Processing (NLP) tasks. Under the …

MatText: Do Language Models Need More than Text & Scale for Materials Modeling?

N Alampara, S Miret, KM Jablonka - arXiv preprint arXiv:2406.17295, 2024 - arxiv.org
Effectively representing materials as text has the potential to leverage the vast
advancements of large language models (LLMs) for discovering new materials. While LLMs …

A systematic survey in geometric deep learning for structure-based drug design

Z Zhang, J Yan, Q Liu, E Chen, M Zitnik - arXiv preprint arXiv:2306.11768, 2023 - arxiv.org
Structure-based drug design (SBDD) utilizes the three-dimensional geometry of proteins to
identify potential drug candidates. Traditional methods, grounded in physicochemical …