Adapting large language models for education: Foundational capabilities, potentials, and challenges

Q Li, L Fu, W Zhang, X Chen, J Yu, W Xia… - arXiv preprint arXiv …, 2023 - arxiv.org
Online education platforms, leveraging the internet to distribute education resources, seek to
provide convenient education but often fall short in real-time communication with students …

Training language models with language feedback at scale

J Scheurer, JA Campos, T Korbak, JS Chan… - arXiv preprint arXiv …, 2023 - arxiv.org
Pretrained language models often generate outputs that are not in line with human
preferences, such as harmful text or factually incorrect summaries. Recent work approaches …

Shepherd: A critic for language model generation

T Wang, P Yu, XE Tan, S O'Brien, R Pasunuru… - arXiv preprint arXiv …, 2023 - arxiv.org
As large language models improve, there is increasing interest in techniques that leverage
these models' capabilities to refine their own outputs. In this work, we introduce Shepherd, a …

SummIt: Iterative text summarization via ChatGPT

H Zhang, X Liu, J Zhang - arXiv preprint arXiv:2305.14835, 2023 - arxiv.org
Text summarization systems have made significant progress in recent years, but typically
generate summaries in a single step. However, the one-shot summarization setting is …

FactKB: Generalizable factuality evaluation using language models enhanced with factual knowledge

S Feng, V Balachandran, Y Bai, Y Tsvetkov - arXiv preprint arXiv …, 2023 - arxiv.org
Evaluating the factual consistency of automatically generated summaries is essential for the
progress and adoption of reliable summarization systems. Despite recent advances, existing …

Reasons to reject? Aligning language models with judgments

W Xu, D Cai, Z Zhang, W Lam, S Shi - arXiv preprint arXiv:2312.14591, 2023 - arxiv.org
As humans, we consistently engage in interactions with our peers and receive feedback in
the form of natural language. This language feedback allows us to reflect on our actions …

Learning to refine with fine-grained natural language feedback

M Wadhwa, X Zhao, JJ Li, G Durrett - arXiv preprint arXiv:2407.02397, 2024 - arxiv.org
Recent work has explored the capability of large language models (LLMs) to identify and
correct errors in LLM-generated responses. These refinement approaches frequently …

A survey on effective invocation methods of massive LLM services

C Wang, B Zhang, D Sui, Z Tu, X Liu, J Kang - arXiv preprint arXiv …, 2024 - arxiv.org
Language models as a service (LMaaS) enable users to accomplish tasks without requiring
specialized knowledge, simply by paying a service provider. However, numerous providers …

ARIES: A corpus of scientific paper edits made in response to peer reviews

M D'Arcy, A Ross, E Bransom, B Kuehl, J Bragg… - arXiv preprint arXiv …, 2023 - arxiv.org
Revising scientific papers based on peer feedback is a challenging task that requires not
only deep scientific knowledge and reasoning, but also the ability to recognize the implicit …

GenAudit: Fixing Factual Errors in Language Model Outputs with Evidence

K Krishna, S Ramprasad, P Gupta, BC Wallace… - arXiv preprint arXiv …, 2024 - arxiv.org
LLMs can generate factually incorrect statements even when provided access to reference
documents. Such errors can be dangerous in high-stakes applications (e.g., document …