Exploration in deep reinforcement learning: A survey
This paper reviews exploration techniques in deep reinforcement learning. Exploration
techniques are of primary importance when solving sparse reward problems. In sparse …
Offline reinforcement learning: Tutorial, review, and perspectives on open problems
In this tutorial article, we aim to provide the reader with the conceptual tools needed to get
started on research on offline reinforcement learning algorithms: reinforcement learning …
COMBO: Conservative offline model-based policy optimization
Model-based reinforcement learning (RL) algorithms, which learn a dynamics
model from logged experience and perform conservative planning under the learned model …
Conservative Q-learning for offline reinforcement learning
Effectively leveraging large, previously collected datasets in reinforcement learning (RL) is
a key challenge for large-scale real-world applications. Offline RL algorithms promise to …
The statistical complexity of interactive decision making
A fundamental challenge in interactive learning and decision making, ranging from bandit
problems to reinforcement learning, is to provide sample-efficient, adaptive learning …
POEM: Out-of-distribution detection with posterior sampling
Out-of-distribution (OOD) detection is indispensable for machine learning models
deployed in the open world. Recently, the use of an auxiliary outlier dataset during training …
Model-based reinforcement learning: A survey
Sequential decision making, commonly formalized as Markov Decision Process (MDP)
optimization, is an important challenge in artificial intelligence. Two key approaches to this …
[Book][B] Bandit algorithms
T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …
Neural approaches to conversational AI
This tutorial surveys neural approaches to conversational AI that were developed in the last
few years. We group conversational systems into three categories: (1) question answering …
A simple neural attentive meta-learner
Deep neural networks excel in regimes with large amounts of data, but tend to struggle
when data is scarce or when they need to adapt quickly to changes in the task. In response …