Related articles - Academic resource search

A survey on evaluation of large language models

Y Chang, X Wang, J Wang, Y Wu, L Yang… - ACM Transactions on …, 2024 - dl.acm.org

Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …

Cited by 849 - Related articles - All 4 versions

[PDF] arxiv.org

Evaluating large language models: A comprehensive survey

Z Guo, R Jin, C Liu, Y Huang, D Shi, L Yu, Y Liu… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) have demonstrated remarkable capabilities across a broad
spectrum of tasks. They have attracted significant attention and been deployed in numerous …

Cited by 59 - Related articles - All 2 versions

[PDF] techrxiv.org

A survey on large language models: Applications, challenges, limitations, and practical usage

MU Hadi, R Qureshi, A Shah, M Irfan, A Zafar… - Authorea …, 2023 - techrxiv.org

Within the vast expanse of computerized language processing, a revolutionary entity known
as Large Language Models (LLMs) has emerged, wielding immense power in its capacity to …

Cited by 112 - Related articles - All 2 versions

[PDF] techrxiv.org

Large language models: a comprehensive survey of its applications, challenges, limitations, and future prospects

MU Hadi, R Qureshi, A Shah, M Irfan, A Zafar… - Authorea …, 2023 - techrxiv.org

Within the vast expanse of computerized language processing, a revolutionary entity known
as Large Language Models (LLMs) has emerged, wielding immense power in its capacity to …

Cited by 80 - Related articles - All 5 versions

[PDF] researchgate.net

[PDF] Efficient large language models: A survey

Z Wan, X Wang, C Liu, S Alam, Y Zheng… - arXiv preprint arXiv …, 2023 - researchgate.net

Abstract Large Language Models (LLMs) have demonstrated remarkable capabilities in
important tasks such as natural language understanding, language generation, and …

Cited by 45 - Related articles - All 7 versions

[PDF] arxiv.org

DyVal: Graph-informed dynamic evaluation of large language models

K Zhu, J Chen, J Wang, NZ Gong, D Yang… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) have achieved remarkable performance in various
evaluation benchmarks. However, concerns about their performance are raised on potential …

Cited by 11 - Related articles - All 3 versions

[PDF] arxiv.org

Benchmarking LLMs via uncertainty quantification

F Ye, M Yang, J Pang, L Wang, DF Wong… - arXiv preprint arXiv …, 2024 - arxiv.org

The proliferation of open-source Large Language Models (LLMs) from various institutions
has highlighted the urgent need for comprehensive evaluation methods. However, current …

Cited by 12 - Related articles - All 3 versions

[PDF] arxiv.org

Are large language model-based evaluators the solution to scaling up multilingual evaluation?

R Hada, V Gumma, A de Wynter, H Diddee… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) have demonstrated impressive performance on Natural
Language Processing (NLP) tasks, such as Question Answering, Summarization, and …

Cited by 20 - Related articles - All 3 versions

[PDF] arxiv.org

Towards an understanding of large language models in software engineering tasks

Z Zheng, K Ning, J Chen, Y Wang, W Chen… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) have drawn widespread attention and research due to their
astounding performance in tasks such as text generation and reasoning. Derivative …

Cited by 27 - Related articles - All 2 versions

[PDF] arxiv.org

State of what art? A call for multi-prompt LLM evaluation

M Mizrahi, G Kaplan, D Malkin, R Dror… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent advances in large language models (LLMs) have led to the development of various
evaluation benchmarks. These benchmarks typically rely on a single instruction template for …

Cited by 20 - Related articles - All 4 versions