Offline reinforcement learning: Tutorial, review, and perspectives on open problems

S Levine, A Kumar, G Tucker, J Fu - arXiv preprint arXiv:2005.01643, 2020 - arxiv.org
In this tutorial article, we aim to provide the reader with the conceptual tools needed to get
started on research on offline reinforcement learning algorithms: reinforcement learning …

Reinforcement learning for generative AI: State of the art, opportunities and open research challenges

G Franceschelli, M Musolesi - Journal of Artificial Intelligence Research, 2024 - jair.org
Generative Artificial Intelligence (AI) is one of the most exciting developments in
Computer Science of the last decade. At the same time, Reinforcement Learning (RL) has …

RAMBO-RL: Robust adversarial model-based offline reinforcement learning

M Rigter, B Lacerda, N Hawes - Advances in neural …, 2022 - proceedings.neurips.cc
Offline reinforcement learning (RL) aims to find performant policies from logged data without
further environment interaction. Model-based algorithms, which learn a model of the …

Deep reinforcement learning

SE Li - Reinforcement learning for sequential decision and …, 2023 - Springer
Similar to humans, RL agents use interactive learning to successfully obtain satisfactory
decision strategies. However, in many cases, it is desirable to learn directly from …

Machine comprehension by text-to-text neural question generation

X Yuan, T Wang, C Gulcehre, A Sordoni… - arXiv preprint arXiv …, 2017 - arxiv.org
We propose a recurrent neural model that generates natural-language questions from
documents, conditioned on answers. We show how to train the model using a combination …

Deep learning based chatbot models

R Csaky - arXiv preprint arXiv:1908.08835, 2019 - arxiv.org
A conversational agent (chatbot) is a piece of software that is able to communicate with
humans using natural language. Modeling conversation is an important task in natural …

Improving neural conversational models with entropy-based data filtering

R Csáky, P Purgai, G Recski - arXiv preprint arXiv:1905.05471, 2019 - arxiv.org
Current neural network-based conversational models lack diversity and generate boring
responses to open-ended utterances. Priors such as persona, emotion, or topic provide …

Conversational agent

Y Bachrach, SJ Coope, CJ Mcmurtrie - US Patent 10,515,155, 2019 - Google Patents
Certain examples described herein provide methods and systems for implementing a
conversational agent, e.g. to train a predictive model used by the conversational agent. In …

Text Generation with Efficient (Soft) Q-Learning

H Guo, B Tan, Z Liu, E Xing, Z Hu - 2021 - openreview.net
Maximum likelihood estimation (MLE) is the predominant algorithm for training text
generation models. This paradigm relies on direct supervision examples, which is not …

Improving conditional sequence generative adversarial networks by stepwise evaluation

YL Tuan, HY Lee - IEEE/ACM Transactions on Audio, Speech …, 2019 - ieeexplore.ieee.org
Sequence generative adversarial networks (SeqGAN) have been used to improve
conditional sequence generation tasks, for example, chit-chat dialogue generation. To …