The Llama 3 herd of models
Modern artificial intelligence (AI) systems are powered by foundation models. This paper
presents a new set of foundation models, called Llama 3. It is a herd of language models …
Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation
The increasing demand for improving deep learning model performance has led to a
paradigm shift in supporting low-precision computation to harness the robustness of deep …
Language models scale reliably with over-training and on downstream tasks
Scaling laws are useful guides for derisking expensive training runs, as they predict
performance of large models using cheaper, small-scale experiments. However, there …
Dart-math: Difficulty-aware rejection tuning for mathematical problem-solving
Solving mathematical problems requires advanced reasoning abilities and presents notable
challenges for large language models. Previous works usually synthesize data from …
Smart parallel automated cryo-electron tomography
F Eisenstein, Y Fukuda, R Danev - Nature Methods, 2024 - nature.com
In situ cryo-electron tomography enables investigation of macromolecules in their native
cellular environment. Samples have become more readily available owing to recent …
Allo: A Programming Model for Composable Accelerator Design
Special-purpose hardware accelerators are increasingly pivotal for sustaining performance
improvements in emerging applications, especially as the benefits of technology scaling …
Advancing state of health estimation for electric vehicles: Transformer-based approach leveraging real-world data
K Nakano, S Vögler, K Tanaka - Advances in Applied Energy, 2024 - Elsevier
The widespread adoption of electric vehicles (EVs) underscores the urgent need for
innovative approaches to estimate their lithium-ion batteries' state of health (SOH), which is …
Eliminating position bias of language models: A mechanistic approach
Position bias has proven to be a prevalent issue of modern language models (LMs), where
the models prioritize content based on its position within the given context. This bias often …
Liger Kernel: Efficient Triton Kernels for LLM Training
Training Large Language Models (LLMs) efficiently at scale presents a formidable
challenge, driven by their ever-increasing computational demands and the need for …
Efficient training of large language models on distributed infrastructures: A survey
Large Language Models (LLMs) like GPT and LLaMA are revolutionizing the AI industry with
their sophisticated capabilities. Training these models requires vast GPU clusters and …