Batch policy gradient methods for improving neural conversation models

S Levine, A Kumar, G Tucker, J Fu - arXiv preprint arXiv:2005.01643, 2020 - arxiv.org

In this tutorial article, we aim to provide the reader with the conceptual tools needed to get
started on research on offline reinforcement learning algorithms: reinforcement learning …

Cited by 2099 Related articles All 3 versions

Reinforcement learning for generative ai: State of the art, opportunities and open research challenges

G Franceschelli, M Musolesi - Journal of Artificial Intelligence Research, 2024 - jair.org

Generative Artificial Intelligence (AI) is one of the most exciting developments in
Computer Science of the last decade. At the same time, Reinforcement Learning (RL) has …

Cited by 16 Related articles All 10 versions

RAMBO-RL: Robust adversarial model-based offline reinforcement learning

M Rigter, B Lacerda, N Hawes - Advances in neural …, 2022 - proceedings.neurips.cc

Offline reinforcement learning (RL) aims to find performant policies from logged data without
further environment interaction. Model-based algorithms, which learn a model of the …

Cited by 126 Related articles All 7 versions

Deep reinforcement learning

SE Li - Reinforcement learning for sequential decision and …, 2023 - Springer

Similar to humans, RL agents use interactive learning to successfully obtain satisfactory
decision strategies. However, in many cases, it is desirable to learn directly from …

Cited by 426 Related articles All 9 versions

Machine comprehension by text-to-text neural question generation

X Yuan, T Wang, C Gulcehre, A Sordoni… - arXiv preprint arXiv …, 2017 - arxiv.org

We propose a recurrent neural model that generates natural-language questions from
documents, conditioned on answers. We show how to train the model using a combination …

Cited by 224 Related articles All 7 versions

Deep learning based chatbot models

R Csaky - arXiv preprint arXiv:1908.08835, 2019 - arxiv.org

A conversational agent (chatbot) is a piece of software that is able to communicate with
humans using natural language. Modeling conversation is an important task in natural …

Cited by 113 Related articles All 6 versions

Improving neural conversational models with entropy-based data filtering

R Csáky, P Purgai, G Recski - arXiv preprint arXiv:1905.05471, 2019 - arxiv.org

Current neural network-based conversational models lack diversity and generate boring
responses to open-ended utterances. Priors such as persona, emotion, or topic provide …

Cited by 67 Related articles All 8 versions

Conversational agent

Y Bachrach, SJ Coope, CJ Mcmurtrie - US Patent 10,515,155, 2019 - Google Patents

Certain examples described herein provide methods and systems for implementing a
conversational agent, e.g., to train a predictive model used by the conversational agent. In …

Cited by 62 Related articles All 4 versions

Text Generation with Efficient (Soft) Q-Learning

H Guo, B Tan, Z Liu, E Xing, Z Hu - 2021 - openreview.net

Maximum likelihood estimation (MLE) is the predominant algorithm for training text
generation models. This paradigm relies on direct supervision examples, which is not …

Cited by 31 Related articles All 2 versions

Improving conditional sequence generative adversarial networks by stepwise evaluation

YL Tuan, HY Lee - IEEE/ACM Transactions on Audio, Speech …, 2019 - ieeexplore.ieee.org

Sequence generative adversarial networks (SeqGAN) have been used to improve
conditional sequence generation tasks, for example, chit-chat dialogue generation. To …

Cited by 73 Related articles All 7 versions