How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN

RT McCoy, P Smolensky, T Linzen, J Gao… - Transactions of the …, 2023 - direct.mit.edu
Current language models can generate high-quality text. Are they simply copying text they
have seen before, or have they learned generalizable linguistic abstractions? To tease apart …

NaturalProver: Grounded mathematical proof generation with language models

S Welleck, J Liu, X Lu, H Hajishirzi… - Advances in Neural …, 2022 - proceedings.neurips.cc
Theorem proving in natural mathematical language–the mixture of symbolic and natural
language used by humans–plays a central role in mathematical advances and education …

LM-Critic: Language models for unsupervised grammatical error correction

M Yasunaga, J Leskovec, P Liang - arXiv preprint arXiv:2109.06822, 2021 - arxiv.org
Training a model for grammatical error correction (GEC) requires a set of labeled
ungrammatical/grammatical sentence pairs, but manually annotating such pairs can be …

A survey on evaluation of summarization methods

L Ermakova, JV Cossu, J Mothe - Information processing & management, 2019 - Elsevier
The increasing volume of textual information on any topic requires its compression to allow
humans to digest it. This implies detecting the most important information and condensing it …

DTVLT: A multi-modal diverse text benchmark for visual language tracking based on LLM

X Li, S Hu, X Feng, D Zhang, M Wu, J Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Visual language tracking (VLT) has emerged as a cutting-edge research area, harnessing
linguistic data to enhance algorithms with multi-modal inputs and broadening the scope of …

MedChatZH: A tuning LLM for traditional Chinese medicine consultations

Y Tan, Z Zhang, M Li, F Pan, H Duan, Z Huang… - Computers in Biology …, 2024 - Elsevier
Generative Large Language Models (LLMs) have achieved significant success in
various natural language processing tasks, including Question-Answering (QA) and …

Reassessing the goals of grammatical error correction: Fluency instead of grammaticality

K Sakaguchi, C Napoles, M Post… - Transactions of the …, 2016 - direct.mit.edu
The field of grammatical error correction (GEC) has grown substantially in recent years, with
research directed at both evaluation metrics and improved system performance against …

Can Language Models Make Fun? A Case Study in Chinese Comical Crosstalk

J Li, X Wu, X Liu, Q Xie, P Tiwari… - Proceedings of the 61st …, 2023 - aclanthology.org
Artificial Intelligence (AI) has been widely used in Natural Language Processing (NLP),
computer vision, speech, robots, and further applied biology, etc. In NLP, Pre-trained …

Large Language Models, scientific knowledge and factuality: A framework to streamline human expert evaluation

M Wysocka, O Wysocki, M Delmas, V Mutel… - Journal of Biomedical …, 2024 - Elsevier
Objective: The paper introduces a framework for the evaluation of the encoding of factual
scientific knowledge, designed to streamline the manual evaluation process typically …

FIREBALL: A dataset of Dungeons and Dragons actual-play with structured game state information

A Zhu, K Aggarwal, A Feng, LJ Martin… - arXiv preprint arXiv …, 2023 - arxiv.org
Dungeons & Dragons (D&D) is a tabletop roleplaying game with complex natural language
interactions between players and hidden state information. Recent work has shown that …