Simple LLM prompting is state-of-the-art for robust and multilingual dialogue evaluation

M Gao, X Hu, J Ruan, X Pu, X Wan - arXiv preprint arXiv:2402.01383, 2024 - arxiv.org

Evaluating natural language generation (NLG) is a vital but challenging problem in artificial
intelligence. Traditional evaluation metrics mainly capturing content (e.g., n-gram) overlap …

Cited by 59 (2 versions)

A survey of dynamic graph neural networks

Y Zheng, L Yi, Z Wei - Frontiers of Computer Science, 2025 - Springer

Graph neural networks (GNNs) have emerged as a powerful tool for effectively mining and
learning from graph-structured data, with applications spanning numerous domains …

Cited by 7 (2 versions)

Multilingual large language model: A survey of resources, taxonomy and frontiers

L Qin, Q Chen, Y Zhou, Z Chen, Y Li, L Liao… - arXiv preprint arXiv …, 2024 - arxiv.org

Multilingual Large Language Models are capable of using powerful Large Language
Models to handle and respond to queries in multiple languages, which achieves remarkable …

Cited by 47 (2 versions)

Knowledge distillation of LLM for education

E Latif, L Fang, P Ma, X Zhai - arXiv preprint arXiv:2312.15842, 2023 - arxiv.org

This study proposes a method for distilling the knowledge of fine-tuned Large Language
Models (LLMs) into a smaller, more efficient, and accurate neural network, specifically …

Cited by 21 (2 versions)

Are LLM-based Evaluators Confusing NLG Quality Criteria?

X Hu, M Gao, S Hu, Y Zhang, Y Chen, T Xu… - arXiv preprint arXiv …, 2024 - arxiv.org

Some prior work has shown that LLMs perform well in NLG evaluation for different tasks.
However, we discover that LLMs seem to confuse different evaluation criteria, which reduces …

Cited by 9 (2 versions)

UniDE: A multi-level and low-resource framework for automatic dialogue evaluation via LLM-based data augmentation and multitask learning

G Ye, H Zhao, Z Zhang, Z Jiang - Information Processing & Management, 2025 - Elsevier

Automatic Dialogue Evaluation (ADE) plays a vital role in developing dialogue and
interactive systems. However, when selecting quality dimensions, previous methods often …

Building a Llama2-finetuned LLM for Odia language utilizing domain knowledge instruction set

GS Kohli, S Parida, S Sekhar, S Saha, NB Nair… - Proceedings of the …, 2023 - dl.acm.org

Building LLMs for languages other than English is in great demand due to the unavailability
and performance of multilingual LLMs, such as understanding the local context. The …

Cited by 4 (3 versions)
