Data-efficient off-policy policy evaluation for reinforcement learning

Offline reinforcement learning: Tutorial, review, and perspectives on open problems

S Levine, A Kumar, G Tucker, J Fu - arXiv preprint arXiv:2005.01643, 2020 - arxiv.org

In this tutorial article, we aim to provide the reader with the conceptual tools needed to get
started on research on offline reinforcement learning algorithms: reinforcement learning …

Cited by 2104 · Related articles · All 3 versions

Reinforcement learning in healthcare: A survey

C Yu, J Liu, S Nemati, G Yin - ACM Computing Surveys (CSUR), 2021 - dl.acm.org

As a subfield of machine learning, reinforcement learning (RL) aims at optimizing decision
making by using interaction samples of an agent with its environment and the potentially …

Cited by 766 · Related articles · All 5 versions

Is pessimism provably efficient for offline RL?

Y Jin, Z Yang, Z Wang - International Conference on …, 2021 - proceedings.mlr.press

We study offline reinforcement learning (RL), which aims to learn an optimal policy based on
a dataset collected a priori. Due to the lack of further interactions with the environment …

Cited by 446 · Related articles · All 7 versions

AWAC: Accelerating online reinforcement learning with offline datasets

A Nair, A Gupta, M Dalal, S Levine - arXiv preprint arXiv:2006.09359, 2020 - arxiv.org

Reinforcement learning (RL) provides an appealing formalism for learning control policies
from experience. However, the classic active formulation of RL necessitates a lengthy active …

Cited by 620 · Related articles · All 7 versions

An introduction to deep reinforcement learning

V François-Lavet, P Henderson, R Islam… - … and Trends® in …, 2018 - nowpublishers.com

Deep reinforcement learning is the combination of reinforcement learning (RL) and deep
learning. This field of research has been able to solve a wide range of complex …

Cited by 1932 · Related articles · All 16 versions

The artificial intelligence clinician learns optimal treatment strategies for sepsis in intensive care

M Komorowski, LA Celi, O Badawi, AC Gordon… - Nature medicine, 2018 - nature.com

Sepsis is the third leading cause of death worldwide and the main cause of mortality in
hospitals, but the best treatment strategy remains uncertain. In particular, evidence …

Cited by 1149 · Related articles · All 11 versions

Machine learning methods that economists should know about

S Athey, GW Imbens - Annual Review of Economics, 2019 - annualreviews.org

We discuss the relevance of the recent machine learning (ML) literature for economics and
econometrics. First we discuss the differences in goals, methods, and settings between the …

Cited by 1122 · Related articles · All 11 versions

Neural approaches to conversational AI

J Gao, M Galley, L Li - The 41st international ACM SIGIR conference on …, 2018 - dl.acm.org

This tutorial surveys neural approaches to conversational AI that were developed in the last
few years. We group conversational systems into three categories: (1) question answering …

Cited by 915 · Related articles · All 16 versions

Provable benefits of actor-critic methods for offline reinforcement learning

A Zanette, MJ Wainwright… - Advances in neural …, 2021 - proceedings.neurips.cc

Actor-critic methods are widely used in offline reinforcement learning practice, but are not so
well-understood theoretically. We propose a new offline actor-critic algorithm that naturally …

Cited by 141 · Related articles · All 8 versions

Deep reinforcement learning that matters

P Henderson, R Islam, P Bachman, J Pineau… - Proceedings of the …, 2018 - ojs.aaai.org

In recent years, significant progress has been made in solving challenging problems across
various domains using deep reinforcement learning (RL). Reproducing existing work and …

Cited by 2566 · Related articles · All 19 versions