LLM-based NLG evaluation: Current status and challenges
Evaluating natural language generation (NLG) is a vital but challenging problem in artificial
intelligence. Traditional evaluation metrics mainly capturing content (e.g., n-gram) overlap …
A survey of dynamic graph neural networks
Graph neural networks (GNNs) have emerged as a powerful tool for effectively mining and
learning from graph-structured data, with applications spanning numerous domains …
Multilingual large language model: A survey of resources, taxonomy and frontiers
Multilingual Large Language Models are capable of using powerful Large Language
Models to handle and respond to queries in multiple languages, which achieves remarkable …
Knowledge distillation of LLM for education
This study proposes a method for distilling the knowledge of fine-tuned Large Language
Models (LLMs) into a smaller, more efficient, and accurate neural network, specifically …
Are LLM-based Evaluators Confusing NLG Quality Criteria?
Some prior work has shown that LLMs perform well in NLG evaluation for different tasks.
However, we discover that LLMs seem to confuse different evaluation criteria, which reduces …
UniDE: A multi-level and low-resource framework for automatic dialogue evaluation via LLM-based data augmentation and multitask learning
G Ye, H Zhao, Z Zhang, Z Jiang - Information Processing & Management, 2025 - Elsevier
Abstract Automatic Dialogue Evaluation (ADE) plays a vital role in developing dialogue and
interactive systems. However, when selecting quality dimensions, previous methods often …
Building a Llama2-finetuned LLM for Odia language utilizing domain knowledge instruction set
Building LLMs for languages other than English is in great demand due to the unavailability
and performance of multilingual LLMs, such as understanding the local context. The …
ECoh: Turn-level Coherence Evaluation for Multilingual Dialogues
Despite being heralded as the new standard for dialogue evaluation, the closed-source
nature of GPT-4 poses challenges for the community. Motivated by the need for lightweight …
Cohesive Conversations: Enhancing Authenticity in Multi-Agent Simulated Dialogues
KC Chu, YP Chen, H Nakayama - arXiv preprint arXiv:2407.09897, 2024 - arxiv.org
This paper investigates the quality of multi-agent dialogues in simulations powered by Large
Language Models (LLMs). Analyzing dialogues and memory over multiple sessions …
ConvoCache: Smart Re-Use of Chatbot Responses
We present ConvoCache, a conversational caching system that solves the problem of slow
and expensive generative AI models in spoken chatbots. ConvoCache finds a semantically …