Transformer technology in molecular science

J Jiang, L Ke, L Chen, B Dou, Y Zhu… - Wiley …, 2024 - Wiley Online Library
A transformer is the foundational architecture behind large language models designed to
handle sequential data by using mechanisms of self‐attention to weigh the importance of …
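A minimal, illustrative sketch of the scaled dot-product self-attention mechanism the snippet above refers to (each token's output is a weighted sum of value vectors, with weights reflecting pairwise relevance). The function and variable names (self_attention, w_q, d_head, etc.) are our own and are not taken from the cited review.

```python
# Illustrative single-head scaled dot-product self-attention (NumPy only).
import numpy as np

def self_attention(x: np.ndarray, w_q: np.ndarray, w_k: np.ndarray, w_v: np.ndarray) -> np.ndarray:
    """x: (seq_len, d_model) token embeddings; w_*: (d_model, d_head) projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v             # project tokens to queries, keys, values
    scores = q @ k.T / np.sqrt(k.shape[-1])          # pairwise relevance, scaled by sqrt(d_head)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: how strongly each token attends to the others
    return weights @ v                                # attention-weighted sum of value vectors

# Toy example: 4 tokens with 8-dimensional embeddings, one attention head
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)        # (4, 8)
```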

Scaling laws for generative mixed-modal language models

A Aghajanyan, L Yu, A Conneau… - International …, 2023 - proceedings.mlr.press
Generative language models define distributions over sequences of tokens that can
represent essentially any combination of data modalities (e.g., any permutation of image …

Retrosynthesis prediction with an interpretable deep-learning framework based on molecular assembly tasks

Y Wang, C Pang, Y Wang, J Jin, J Zhang… - Nature …, 2023 - nature.com
Automating retrosynthesis with artificial intelligence expedites organic chemistry research in
digital laboratories. However, most existing deep-learning approaches are hard to explain …

Enhancing activity prediction models in drug discovery with the ability to understand human language

P Seidl, A Vall, S Hochreiter… - … on Machine Learning, 2023 - proceedings.mlr.press
Activity and property prediction models are the central workhorses in drug discovery and
materials sciences, but currently, they have to be trained or fine-tuned for new tasks. Without …

Scientific large language models: A survey on biological & chemical domains

Q Zhang, K Ding, T Lyv, X Wang, Q Yin… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have emerged as a transformative power in enhancing
natural language comprehension, representing a significant stride toward artificial general …

Bayesian optimization of catalysts with in-context learning

MC Ramos, SS Michtavy, MD Porosoff… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) are able to do accurate classification with zero or only a few
examples (in-context learning). We show a prompting system that enables regression with …
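A hedged sketch of few-shot ("in-context") prompting for a numeric prediction task, in the spirit of the entry above. The prompt template, the example SMILES/value pairs, and the `query_llm` placeholder are hypothetical illustrations, not the authors' actual system.

```python
# Hypothetical few-shot prompt construction for LLM-based regression.
examples = [
    ("CC(=O)OC1=CC=CC=C1C(=O)O", 1.19),   # illustrative (SMILES, measured value) pairs
    ("CCO", -0.31),
    ("c1ccccc1", 2.13),
]

def build_prompt(query_smiles: str) -> str:
    """Assemble in-context examples followed by the query molecule."""
    lines = ["Predict the property value for each molecule."]
    for smiles, value in examples:
        lines.append(f"Molecule: {smiles}\nValue: {value:.2f}")
    lines.append(f"Molecule: {query_smiles}\nValue:")
    return "\n".join(lines)

prompt = build_prompt("CCN(CC)CC")
# response = query_llm(prompt)   # stand-in for whatever completion API is used;
#                                # the returned text is parsed as the predicted value
print(prompt)
```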

Coati: Multimodal contrastive pretraining for representing and traversing chemical space

B Kaufman, EC Williams, C Underkoffler… - Journal of Chemical …, 2024 - ACS Publications
Creating a successful small molecule drug is a challenging multiparameter optimization
problem in an effectively infinite space of possible molecules. Generative models have …

Applications of Transformers in Computational Chemistry: Recent Progress and Prospects

R Wang, Y Ji, Y Li, ST Lee - The Journal of Physical Chemistry …, 2024 - ACS Publications
The powerful data processing and pattern recognition capabilities of machine learning (ML)
technology have provided technical support for the innovation in computational chemistry …

Regression with large language models for materials and molecular property prediction

R Jacobs, MP Polak, LE Schultz, H Mahdavi… - arXiv preprint arXiv …, 2024 - arxiv.org
We demonstrate the ability of large language models (LLMs) to perform material and
molecular property regression tasks, a significant deviation from the conventional LLM use …

Lost in Translation: Chemical Language Models and the Misunderstanding of Molecule Structures

V Ganeeva, A Sakhovskiy, K Khrabrov… - Findings of the …, 2024 - aclanthology.org
The recent integration of chemistry with natural language processing (NLP) has advanced
drug discovery. Molecule representation in language models (LMs) is crucial in enhancing …