Should corpora texts be gold standards for NLG?

S Gehrmann, E Clark, T Sellam - Journal of Artificial Intelligence Research, 2023 - jair.org

Abstract Evaluation practices in natural language generation (NLG) have many known flaws,
but improved evaluation approaches are rarely widely adopted. This issue has become …

Cited by 133 | Related articles | All 6 versions

[PDF] jair.org

Survey of the state of the art in natural language generation: Core tasks, applications and evaluation

A Gatt, E Krahmer - Journal of Artificial Intelligence Research, 2018 - jair.org

This paper surveys the current state of the art in Natural Language Generation (NLG),
defined as the task of generating text or speech from non-linguistic input. A survey of NLG is …

Cited by 1087 | Related articles | All 15 versions

[PDF] sciencedirect.com

Choosing words in computer-generated weather forecasts

E Reiter, S Sripada, J Hunter, J Yu, I Davy - Artificial Intelligence, 2005 - Elsevier

One of the main challenges in automatically generating textual weather forecasts is
choosing appropriate English words to communicate numeric weather data. A corpus-based …

Cited by 437 | Related articles | All 17 versions

[BOOK] Not exactly: In praise of vagueness

K Van Deemter - 2010 - books.google.com

Not everything is black and white. Our daily lives are full of vagueness or fuzziness.
Language is the most obvious example: for instance, when we describe someone as tall, it is …

Cited by 298 | Related articles | All 6 versions

[PDF] aclanthology.org

[PDF] Comparing automatic and human evaluation of NLG systems

A Belz, E Reiter - 11th Conference of the European Chapter of the …, 2006 - aclanthology.org

We consider the evaluation problem in Natural Language Generation (NLG) and present
results for evaluating several NLG systems with similar functionality, including a knowledge …

Cited by 280 | Related articles | All 5 versions

[PDF] mit.edu

An investigation into the validity of some metrics for automatically evaluating natural language generation systems

E Reiter, A Belz - Computational Linguistics, 2009 - direct.mit.edu

There is growing interest in using automatically computed corpus-based evaluation metrics
to evaluate Natural Language Generation (NLG) systems, because these are often …

Cited by 250 | Related articles | All 12 versions

[PDF] brighton.ac.uk

Automatic generation of weather forecast texts using comprehensive probabilistic generation-space models

A Belz - Natural Language Engineering, 2008 - cambridge.org

Two important recent trends in natural language generation are (i) probabilistic techniques
and (ii) comprehensive approaches that move away from traditional strictly modular and …

Cited by 265 | Related articles | All 16 versions

[PDF] jair.org

Individual and domain adaptation in sentence planning for dialogue

MA Walker, A Stent, F Mairesse, R Prasad - Journal of Artificial Intelligence …, 2007 - jair.org

One of the biggest challenges in the development and deployment of spoken dialogue
systems is the design of the spoken language generation module. This challenge arises …

Cited by 168 | Related articles | All 18 versions

[PDF] abdn.ac.uk

Evaluating factual accuracy in complex data-to-text

C Thomson, E Reiter, B Sundararajan - Computer Speech & Language, 2023 - Elsevier

It is essential that data-to-text Natural Language Generation (NLG) systems produce texts
which are factually accurate. We examine accuracy issues in the task of generating …

Cited by 15 | Related articles | All 5 versions

[PDF] aclanthology.org

Rethinking the agreement in human evaluation tasks

J Amidei, P Piwek, A Willis - Proceedings of the 27th International …, 2018 - aclanthology.org

Human evaluations are broadly thought to be more valuable the higher the inter-annotator
agreement. In this paper we examine this idea. We will describe our experiments and …

Cited by 57 | Related articles | All 4 versions