A survey of meta-reinforcement learning
While deep reinforcement learning (RL) has fueled multiple high-profile successes in
machine learning, it is held back from more widespread adoption by its often poor data …
IQ-Learn: Inverse soft-Q learning for imitation
In many sequential decision-making problems (e.g., robotics control, game playing,
sequential prediction), human or expert data is available containing useful information about …
A survey of inverse reinforcement learning
Learning from demonstration, or imitation learning, is the process of learning to act in an
environment from examples provided by a teacher. Inverse reinforcement learning (IRL) is a …
Meta-reward-net: Implicitly differentiable reward learning for preference-based reinforcement learning
Setting up a well-designed reward function has been challenging for many
reinforcement learning applications. Preference-based reinforcement learning (PbRL) …
Generalized decision transformer for offline hindsight information matching
How to extract as much learning signal from each trajectory data has been a key problem in
reinforcement learning (RL), where sample inefficiency has posed serious challenges for …
Path planning using Neural A* search
We present Neural A*, a novel data-driven search method for path planning problems.
Despite the recent increasing attention to data-driven path planning, machine learning …
Why so pessimistic? Estimating uncertainties for offline RL through ensembles, and why their independence matters
K Ghasemipour, SS Gu… - Advances in Neural …, 2022 - proceedings.neurips.cc
Motivated by the success of ensembles for uncertainty estimation in supervised learning, we
take a renewed look at how ensembles of $Q$-functions can be leveraged as the primary …
Inverse reinforcement learning as the algorithmic basis for theory of mind: current methods and open problems
J Ruiz-Serra, MS Harré - Algorithms, 2023 - mdpi.com
Theory of mind (ToM) is the psychological construct by which we model another's internal
mental states. Through ToM, we adjust our own behaviour to best suit a social context, and …
Hard choices in artificial intelligence
As AI systems are integrated into high stakes social domains, researchers now examine how
to design and operate them in a safe and ethical manner. However, the criteria for identifying …
Procedure planning in instructional videos via contextual modeling and model-based policy learning
Learning new skills by observing humans' behaviors is an essential capability of AI. In this
work, we leverage instructional videos to study humans' decision-making processes …