Uncovering new families and folds in the natural protein universe

J Durairaj, AM Waterhouse, T Mets, T Brodiazhenko… - Nature, 2023 - nature.com
We are now entering a new era in protein sequence and structure annotation, with hundreds
of millions of predicted protein structures made available through the AlphaFold database …

Artificial intelligence for science in quantum, atomistic, and continuum systems

X Zhang, L Wang, J Helwig, Y Luo, C Fu, Y Xie… - arXiv preprint arXiv …, 2023 - arxiv.org
Advances in artificial intelligence (AI) are fueling a new paradigm of discoveries in natural
sciences. Today, AI has started to advance natural sciences by improving, accelerating, and …

Genetic manipulation of Patescibacteria provides mechanistic insights into microbial dark matter and the epibiotic lifestyle

Y Wang, LA Gallagher, PA Andrade, A Liu… - Cell, 2023 - cell.com
Patescibacteria, also known as the candidate phyla radiation (CPR), are a diverse group of
bacteria that constitute a disproportionately large fraction of microbial dark matter. Its few …

Large language models improve annotation of prokaryotic viral proteins

ZN Flamholz, SJ Biller, L Kelly - Nature Microbiology, 2024 - nature.com
Viral genomes are poorly annotated in metagenomic samples, representing an obstacle to
understanding viral diversity and function. Current annotation approaches rely on alignment …

Machine learning-aided design and screening of an emergent protein function in synthetic cells

S Kohyama, BP Frohn, L Babl, P Schwille - Nature Communications, 2024 - nature.com
Recently, utilization of Machine Learning (ML) has led to astonishing progress in
computational protein design, bringing into reach the targeted engineering of proteins for …

Latent generative landscapes as maps of functional diversity in protein sequence space

C Ziegler, J Martin, C Sinner, F Morcos - Nature Communications, 2023 - nature.com
Variational autoencoders are unsupervised learning models with generative capabilities;
when applied to protein data, they classify sequences by phylogeny and generate de novo …

Proteome-scale prediction of molecular mechanisms underlying dominant genetic diseases

M Badonyi, JA Marsh - PLOS ONE, 2024 - journals.plos.org
Many dominant genetic disorders result from protein-altering mutations, acting primarily
through dominant-negative (DN), gain-of-function (GOF), and loss-of-function (LOF) …

Sensitive remote homology search by local alignment of small positional embeddings from protein language models

SR Johnson, M Peshwa, Z Sun - eLife, 2024 - elifesciences.org
Accurately detecting distant evolutionary relationships between proteins remains an
ongoing challenge in bioinformatics. Search methods based on primary sequence struggle …

What is hidden in the darkness? Deep-learning assisted large-scale protein family curation uncovers novel protein families and folds

J Durairaj, AM Waterhouse, T Mets, T Brodiazhenko… - bioRxiv, 2023 - biorxiv.org
Driven by the development and upscaling of fast genome sequencing and assembly
pipelines, the number of protein-coding sequences deposited in public protein sequence …

Neuropeptidomics of the American Lobster Homarus americanus

G Lu, VNH Tran, W Wu, M Ma, L Li - Journal of Proteome Research, 2024 - ACS Publications
The American lobster, Homarus americanus, is not only of considerable economic
importance but has also emerged as a premier model organism in neuroscience research …