Codon language embeddings provide strong signals for use in protein engineering

C Outeiral, CM Deane - Nature Machine Intelligence, 2024 - nature.com
Protein representations from deep language models have yielded state-of-the-art
performance across many tasks in computational protein engineering. In recent years …

Learning the protein language: Evolution, structure, and function

T Bepler, B Berger - Cell systems, 2021 - cell.com
Language models have recently emerged as a powerful machine-learning approach for
distilling information from massive protein sequence databases. From readily available …

Protein language models meet reduced amino acid alphabets

I Ieremie, RM Ewing, M Niranjan - Bioinformatics, 2024 - academic.oup.com
Abstract Motivation Protein language models (PLMs), which borrowed ideas for modelling
and inference from natural language processing, have demonstrated the ability to extract …

Modeling protein using large-scale pretrain language model

Y Xiao, J Qiu, Z Li, CY Hsieh, J Tang - arXiv preprint arXiv:2108.07435, 2021 - arxiv.org
Protein is linked to almost every life process. Therefore, analyzing the biological structure
and property of protein sequences is critical to the exploration of life, as well as disease …

Progen2: exploring the boundaries of protein language models

E Nijkamp, JA Ruffolo, EN Weinstein, N Naik, A Madani - Cell systems, 2023 - cell.com
Attention-based models trained on protein sequences have demonstrated incredible
success at classification and generation tasks relevant for artificial-intelligence-driven …

Large language models generate functional protein sequences across diverse families

A Madani, B Krause, ER Greene, S Subramanian… - Nature …, 2023 - nature.com
Deep-learning language models have shown promise in various biotechnological
applications, including protein design and engineering. Here we describe ProGen, a …

Endowing protein language models with structural knowledge

D Chen, P Hartout, P Pellizzoni, C Oliver… - arXiv preprint arXiv …, 2024 - arxiv.org
Understanding the relationships between protein sequence, structure and function is a long-
standing biological challenge with manifold implications from drug design to our …

[PDF][PDF] Language models of protein sequences at the scale of evolution enable accurate structure prediction

Z Lin, H Akin, R Rao, B Hie, Z Zhu, W Lu… - BioRxiv, 2022 - biorxiv.org
Large language models have recently been shown to develop emergent capabilities with
scale, going beyond simple pattern matching to perform higher level reasoning and …

Ankh: Optimized protein language model unlocks general-purpose modelling

A Elnaggar, H Essam, W Salah-Eldin… - arXiv preprint arXiv …, 2023 - arxiv.org
As opposed to scaling-up protein language models (PLMs), we seek improving performance
via protein-specific optimization. Although the proportionality between the language model …

Structure-informed language models are protein designers

Z Zheng, Y Deng, D Xue, Y Zhou… - … on machine learning, 2023 - proceedings.mlr.press
This paper demonstrates that language models are strong structure-based protein
designers. We present LM-Design, a generic approach to reprogramming sequence-based …