A survey on evaluation of large language models
Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …
Foundation metrics for evaluating effectiveness of healthcare conversations powered by generative AI
Abstract Generative Artificial Intelligence is set to revolutionize healthcare delivery by
transforming traditional patient care into a more personalized, efficient, and proactive …
M3Exam: A multilingual, multimodal, multilevel benchmark for examining large language models
Despite the existence of various benchmarks for evaluating natural language processing
models, we argue that human exams are a more suitable means of evaluating general …
Yi: Open foundation models by 01.AI
We introduce the Yi model family, a series of language and multimodal models that
demonstrate strong multi-dimensional capabilities. The Yi model family is based on 6B and …
LLaMA beyond English: An empirical study on language capability transfer
In recent times, substantial advancements have been witnessed in large language models
(LLMs), exemplified by ChatGPT, showcasing remarkable proficiency across a range of …
CMB: A comprehensive medical benchmark in Chinese
Large Language Models (LLMs) provide a possibility to make a great breakthrough in
medicine. The establishment of a standardized medical benchmark becomes a fundamental …
Scientific large language models: A survey on biological & chemical domains
Large Language Models (LLMs) have emerged as a transformative power in enhancing
natural language comprehension, representing a significant stride toward artificial general …
LawBench: Benchmarking legal knowledge of large language models
Large language models (LLMs) have demonstrated strong capabilities in various aspects.
However, when applying them to the highly specialized, safe-critical legal domain, it is …
NPHardEval: Dynamic benchmark on reasoning ability of large language models via complexity classes
Complex reasoning ability is one of the most important features of current LLMs, which has
also been leveraged to play an integral role in complex decision-making tasks. Therefore …
Multilingual large language model: A survey of resources, taxonomy and frontiers
Multilingual Large Language Models are capable of using powerful Large Language
Models to handle and respond to queries in multiple languages, which achieves remarkable …