[HTML][HTML] The language of proteins: NLP, machine learning & protein sequences
Natural language processing (NLP) is a field of computer science concerned with automated
text and language analysis. In recent years, following a series of breakthroughs in deep and …
text and language analysis. In recent years, following a series of breakthroughs in deep and …
Pre-trained language models in biomedical domain: A systematic survey
Pre-trained language models (PLMs) have been the de facto paradigm for most natural
language processing tasks. This also benefits the biomedical domain: researchers from …
language processing tasks. This also benefits the biomedical domain: researchers from …
Evolutionary-scale prediction of atomic-level protein structure with a language model
Recent advances in machine learning have leveraged evolutionary information in multiple
sequence alignments to predict protein structure. We demonstrate direct inference of full …
sequence alignments to predict protein structure. We demonstrate direct inference of full …
Learning inverse folding from millions of predicted structures
We consider the problem of predicting a protein sequence from its backbone atom
coordinates. Machine learning approaches to this problem to date have been limited by the …
coordinates. Machine learning approaches to this problem to date have been limited by the …
[HTML][HTML] Highly accurate protein structure prediction with AlphaFold
Proteins are essential to life, and understanding their structure can facilitate a mechanistic
understanding of their function. Through an enormous experimental effort 1, 2, 3, 4, the …
understanding of their function. Through an enormous experimental effort 1, 2, 3, 4, the …
Single-sequence protein structure prediction using a language model and deep learning
R Chowdhury, N Bouatta, S Biswas, C Floristean… - Nature …, 2022 - nature.com
AlphaFold2 and related computational systems predict protein structure using deep learning
and co-evolutionary relationships encoded in multiple sequence alignments (MSAs) …
and co-evolutionary relationships encoded in multiple sequence alignments (MSAs) …
ProteinBERT: a universal deep-learning model of protein sequence and function
Self-supervised deep language modeling has shown unprecedented success across natural
language tasks, and has recently been repurposed to biological sequences. However …
language tasks, and has recently been repurposed to biological sequences. However …
Protein remote homology detection and structural alignment using deep learning
Exploiting sequence–structure–function relationships in biotechnology requires improved
methods for aligning proteins that have low sequence similarity to previously annotated …
methods for aligning proteins that have low sequence similarity to previously annotated …
Prottrans: Toward understanding the language of life through self-supervised learning
Computational biology and bioinformatics provide vast data gold-mines from protein
sequences, ideal for Language Models (LMs) taken from Natural Language Processing …
sequences, ideal for Language Models (LMs) taken from Natural Language Processing …