Retrieval-augmented generation for large language models: A survey

Y Gao, Y Xiong, X Gao, K Jia, J Pan, Y Bi, Y Dai… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) demonstrate powerful capabilities, but they still face
challenges in practical applications, such as hallucinations, slow knowledge updates, and …
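
As a rough illustration of the retrieve-then-generate pattern such surveys cover, here is a minimal sketch; the toy corpus, the hash-based embedding stand-in, and the prompt template are placeholder assumptions for illustration, not code from the paper.

```python
# Minimal retrieval-augmented generation sketch (illustrative placeholder code).
# A real pipeline would use a trained text encoder and an actual LLM call
# instead of the stand-ins below.
import numpy as np

corpus = [
    "The Transformer architecture was introduced in 2017.",
    "Retrieval-augmented generation grounds answers in retrieved documents.",
]

def embed(text: str) -> np.ndarray:
    # Stand-in embedding: deterministic pseudo-random vector per string.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(64)

def retrieve(query: str, k: int = 1) -> list[str]:
    # Rank passages by cosine similarity between query and passage embeddings.
    q = embed(query)
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = [cos(q, embed(doc)) for doc in corpus]
    order = np.argsort(scores)[::-1][:k]
    return [corpus[i] for i in order]

def rag_prompt(query: str) -> str:
    # Prepend the retrieved context to the question before calling an LLM.
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(rag_prompt("What does retrieval-augmented generation do?"))
```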

Survey of hallucination in natural language generation

Z Ji, N Lee, R Frieske, T Yu, D Su, Y Xu, E Ishii… - ACM Computing …, 2023 - dl.acm.org
Natural Language Generation (NLG) has improved exponentially in recent years thanks to
the development of sequence-to-sequence deep learning technologies such as Transformer …

Direct preference optimization: Your language model is secretly a reward model

R Rafailov, A Sharma, E Mitchell… - Advances in …, 2024 - proceedings.neurips.cc
While large-scale unsupervised language models (LMs) learn broad world knowledge and
some reasoning skills, achieving precise control of their behavior is difficult due to the …
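
For context, the core objective this paper introduces can be written as follows (the standard DPO loss over preference pairs (x, y_w, y_l), with frozen reference policy π_ref, logistic function σ, and temperature β):

```latex
% DPO objective: maximize the margin between preferred (y_w) and dispreferred (y_l)
% responses, measured as log-probability ratios against a frozen reference policy.
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) =
  -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}
  \left[ \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
    - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
  \right) \right]
```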

CodeRL: Mastering code generation through pretrained models and deep reinforcement learning

H Le, Y Wang, AD Gotmare… - Advances in Neural …, 2022 - proceedings.neurips.cc
Program synthesis or code generation aims to generate a program that satisfies a problem
specification. Recent approaches using large-scale pretrained language models (LMs) have …

Fine-tuning language models to find agreement among humans with diverse preferences

M Bakker, M Chadwick, H Sheahan… - Advances in …, 2022 - proceedings.neurips.cc
Recent work on large language models (LLMs) has used fine-tuning to align outputs with
the preferences of a prototypical user. This work assumes that human preferences are static …

Unified structure generation for universal information extraction

Y Lu, Q Liu, D Dai, X Xiao, H Lin, X Han, L Sun… - arXiv preprint arXiv …, 2022 - arxiv.org
Information extraction suffers from its varying targets, heterogeneous structures, and
demand-specific schemas. In this paper, we propose a unified text-to-structure generation …

Semantic communications for future internet: Fundamentals, applications, and challenges

W Yang, H Du, ZQ Liew, WYB Lim… - … Surveys & Tutorials, 2022 - ieeexplore.ieee.org
With the increasing demand for intelligent services, the sixth-generation (6G) wireless
networks will shift from a traditional architecture that focuses solely on a high transmission …

Cascaded diffusion models for high fidelity image generation

J Ho, C Saharia, W Chan, DJ Fleet, M Norouzi… - Journal of Machine …, 2022 - jmlr.org
We show that cascaded diffusion models are capable of generating high fidelity images on
the class-conditional ImageNet generation benchmark, without any assistance from auxiliary …

Contrastive decoding: Open-ended text generation as optimization

XL Li, A Holtzman, D Fried, P Liang, J Eisner… - arXiv preprint arXiv …, 2022 - arxiv.org
Given a language model (LM), maximum probability is a poor decoding objective for open-
ended generation, because it produces short and repetitive text. On the other hand …
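
A minimal sketch of the contrastive scoring idea: restrict candidates to tokens the large "expert" LM finds plausible, then prefer tokens whose expert log-probability most exceeds that of a small "amateur" LM. The toy distributions and the α value below are made-up numbers for illustration, not the paper's implementation.

```python
import math

# Toy next-token distributions from a large "expert" LM and a small "amateur" LM.
# Real contrastive decoding uses full model logits at each decoding step.
expert = {"the": 0.30, "ocean": 0.25, "same": 0.20, "a": 0.25}
amateur = {"the": 0.45, "ocean": 0.05, "same": 0.35, "a": 0.15}

alpha = 0.1  # plausibility threshold relative to the expert's best token
cutoff = alpha * max(expert.values())

# Keep only tokens the expert finds plausible, then score by the log-prob difference.
candidates = {t: p for t, p in expert.items() if p >= cutoff}
scores = {t: math.log(expert[t]) - math.log(amateur[t]) for t in candidates}
print(max(scores, key=scores.get))  # favors "ocean": likely under the expert, unlikely under the amateur
```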

From show to tell: A survey on deep learning-based image captioning

M Stefanini, M Cornia, L Baraldi… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Connecting Vision and Language plays an essential role in Generative Intelligence. For this
reason, large research efforts have been devoted to image captioning, i.e., describing images …