Language models for biological research: a primer

E Simon, K Swanson, J Zou - Nature Methods, 2024 - nature.com
Abstract: Language models are playing an increasingly important role in many areas of
artificial intelligence (AI) and computational biology. In this primer, we discuss the ways in …

Searching for the optimal microbial factory: high-throughput biosensors and analytical techniques for screening small molecules

E O'Connor, J Micklefield, Y Cai - Current Opinion in Biotechnology, 2024 - Elsevier
Highlights: • Engineering for the bioproduction of natural products is bottlenecked by
screening. • To overcome this, we explore advancements in biosensors and analytical …

Are genomic language models all you need? Exploring genomic language models on protein downstream tasks

S Boshar, E Trop, BP de Almeida, L Copoiu… - …, 2024 - academic.oup.com
Motivation: Large language models, trained on enormous corpora of biological sequences,
are state-of-the-art for downstream genomic and proteomic tasks. Since the genome …

SegmentNT: annotating the genome at single-nucleotide resolution with DNA foundation models

BP de Almeida, H Dalla-Torre, G Richard, C Blum… - bioRxiv, 2024 - biorxiv.org
Foundation models have achieved remarkable success in several fields such as natural
language processing, computer vision and more recently biology. DNA foundation models in …

Machine learning for predicting protein properties: A comprehensive review

Y Wang, Y Zhang, X Zhan, Y He, Y Yang, L Cheng… - Neurocomputing, 2024 - Elsevier
In the field of protein engineering, the function and structure of proteins are key to
understanding cellular mechanisms, biological evolution, and biodiversity. With the …

Transformer model generated bacteriophage genomes are compositionally distinct from natural sequences

J Ratcliff - NAR Genomics and Bioinformatics, 2024 - academic.oup.com
Novel applications of language models in genomics promise to have a large impact on the
field. The megaDNA model is the first publicly available generative model for creating …

Evaluating the representational power of pre-trained DNA language models for regulatory genomics

Z Tang, PK Koo - bioRxiv, 2024 - ncbi.nlm.nih.gov
The emergence of genomic language models (gLMs) offers an unsupervised approach to
learn a wide diversity of cis-regulatory patterns in the non-coding genome without requiring …

Using Machine Learning to Enhance and Accelerate Synthetic Biology

K Rai, Y Wang, RW O'Connell, AB Patel… - Current Opinion in …, 2024 - Elsevier
Engineering synthetic regulatory circuits with precise input-output behavior—a central goal
in synthetic biology—remains encumbered by the inherent molecular complexity of cells …

Advancing CRISPR base editing technology through innovative strategies and ideas

X Fan, Y Lei, L Wang, X Wu, D Li - Science China Life Sciences, 2024 - Springer
The innovation of CRISPR/Cas gene editing technology has developed rapidly in recent
years. It is widely used in the fields of disease animal model construction, biological …

Multi-modal Transfer Learning between Biological Foundation Models

JJ Garau-Luis, P Bordes, L Gonzalez, M Roller… - arXiv preprint arXiv …, 2024 - arxiv.org
Biological sequences encode fundamental instructions for the building blocks of life, in the
form of DNA, RNA, and proteins. Modeling these sequences is key to understand disease …