Efficient methods for natural language processing: A survey
Recent work in natural language processing (NLP) has yielded appealing results from
scaling model parameters and training data; however, using only scale to improve …
MENLI: Robust evaluation metrics from natural language inference
Recently proposed BERT-based evaluation metrics for text generation perform well on
standard benchmarks but are vulnerable to adversarial attacks, e.g., relating to information …
LLMs as narcissistic evaluators: When ego inflates evaluation scores
Automatic evaluation of generated textual content presents an ongoing challenge within the
field of NLP. Given the impressive capabilities of modern language models (LMs) across …
RADE: Reference-Assisted Dialogue Evaluation for Open-Domain Dialogue
Evaluating open-domain dialogue systems is challenging for reasons such as the
one-to-many problem, i.e., many appropriate responses other than just the golden response. As of …
CLEME: debiasing multi-reference evaluation for grammatical error correction
Evaluating the performance of Grammatical Error Correction (GEC) systems is a challenging
task due to its subjectivity. Designing an evaluation metric that is as objective as possible is …
Aligning neural machine translation models: Human feedback in training and inference
Reinforcement learning from human feedback (RLHF) is a recent technique to improve the
quality of the text generated by a language model, making it closer to what humans would …
Evaluation metrics on text summarization: comprehensive survey
E Davoodijam, M Alambardar Meybodi - Knowledge and Information …, 2024 - Springer
Automatic text summarization is the process of shortening a large document into a summary
text that preserves the main concepts and key points of the original document. Due to the …
Large language models are inconsistent and biased evaluators
The zero-shot capability of Large Language Models (LLMs) has enabled highly flexible,
reference-free metrics for various tasks, making LLM evaluators common tools in NLP …
Recent Advances in Generative AI and Large Language Models: Current Status, Challenges, and Perspectives
The emergence of Generative Artificial Intelligence (AI) and Large Language Models (LLMs)
has marked a new era of Natural Language Processing (NLP), introducing unprecedented …
ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications
Extensive efforts in the past have been directed toward the development of summarization
datasets. However, a predominant number of these resources have been (semi) …