A survey on evaluation of large language models
Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …
industry, owing to their unprecedented performance in various applications. As LLMs …
Challenges and applications of large language models
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …
Judging llm-as-a-judge with mt-bench and chatbot arena
Evaluating large language model (LLM) based chat assistants is challenging due to their
broad capabilities and the inadequacy of existing benchmarks in measuring human …
broad capabilities and the inadequacy of existing benchmarks in measuring human …
Llama 2: Open foundation and fine-tuned chat models
In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large
language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine …
language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine …
A survey of large language models
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …
Mmmu: A massive multi-discipline multimodal understanding and reasoning benchmark for expert agi
We introduce MMMU: a new benchmark designed to evaluate multimodal models on
massive multi-discipline tasks demanding college-level subject knowledge and deliberate …
massive multi-discipline tasks demanding college-level subject knowledge and deliberate …
Mixtral of experts
We introduce Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. Mixtral has
the same architecture as Mistral 7B, with the difference that each layer is composed of 8 …
the same architecture as Mistral 7B, with the difference that each layer is composed of 8 …
mplug-owl2: Revolutionizing multi-modal large language model with modality collaboration
Abstract Multi-modal Large Language Models (MLLMs) have demonstrated impressive
instruction abilities across various open-ended tasks. However previous methods have …
instruction abilities across various open-ended tasks. However previous methods have …
Mistral 7B
We introduce Mistral 7B v0. 1, a 7-billion-parameter language model engineered for
superior performance and efficiency. Mistral 7B outperforms Llama 2 13B across all …
superior performance and efficiency. Mistral 7B outperforms Llama 2 13B across all …
What can large language models do in chemistry? a comprehensive benchmark on eight tasks
Abstract Large Language Models (LLMs) with strong abilities in natural language
processing tasks have emerged and have been applied in various kinds of areas such as …
processing tasks have emerged and have been applied in various kinds of areas such as …