Machine learning-guided protein engineering
Recent progress in engineering highly promising biocatalysts has increasingly involved
machine learning methods. These methods leverage existing experimental and simulation …
machine learning methods. These methods leverage existing experimental and simulation …
Hyenadna: Long-range genomic sequence modeling at single nucleotide resolution
Genomic (DNA) sequences encode an enormous amount of information for gene regulation
and protein synthesis. Similar to natural language models, researchers have proposed …
and protein synthesis. Similar to natural language models, researchers have proposed …
Dnabert-2: Efficient foundation model and benchmark for multi-species genome
Decoding the linguistic intricacies of the genome is a crucial problem in biology, and pre-
trained foundational models such as DNABERT and Nucleotide Transformer have made …
trained foundational models such as DNABERT and Nucleotide Transformer have made …
To transformers and beyond: large language models for the genome
In the rapidly evolving landscape of genomics, deep learning has emerged as a useful tool
for tackling complex computational challenges. This review focuses on the transformative …
for tackling complex computational challenges. This review focuses on the transformative …
Caduceus: Bi-directional equivariant long-range dna sequence modeling
Large-scale sequence modeling has sparked rapid advances that now extend into biology
and genomics. However, modeling genomic sequences introduces challenges such as the …
and genomics. However, modeling genomic sequences introduces challenges such as the …
Simple linear attention language models balance the recall-throughput tradeoff
Recent work has shown that attention-based language models excel at recall, the ability to
ground generations in tokens previously seen in context. However, the efficiency of attention …
ground generations in tokens previously seen in context. However, the efficiency of attention …
Bend: Benchmarking dna language models on biologically meaningful tasks
The genome sequence contains the blueprint for governing cellular processes. While the
availability of genomes has vastly increased over the last decades, experimental annotation …
availability of genomes has vastly increased over the last decades, experimental annotation …
Progress and Opportunities of Foundation Models in Bioinformatics
Bioinformatics has witnessed a paradigm shift with the increasing integration of artificial
intelligence (AI), particularly through the adoption of foundation models (FMs). These AI …
intelligence (AI), particularly through the adoption of foundation models (FMs). These AI …
DiscDiff: Latent Diffusion Model for DNA Sequence Generation
This paper introduces a novel framework for DNA sequence generation, comprising two key
components: DiscDiff, a Latent Diffusion Model (LDM) tailored for generating discrete DNA …
components: DiscDiff, a Latent Diffusion Model (LDM) tailored for generating discrete DNA …
BEACON: Benchmark for Comprehensive RNA Tasks and Language Models
Y Ren, Z Chen, L Qiao, H Jing, Y Cai, S Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
RNA plays a pivotal role in translating genetic instructions into functional outcomes,
underscoring its importance in biological processes and disease mechanisms. Despite the …
underscoring its importance in biological processes and disease mechanisms. Despite the …