IQ-Learn: Inverse soft-Q learning for imitation

D Garg, S Chakraborty, C Cundy… - Advances in Neural …, 2021 - proceedings.neurips.cc
In many sequential decision-making problems (eg, robotics control, game playing,
sequential prediction), human or expert data is available containing useful information about …

Maximum-likelihood inverse reinforcement learning with finite-time guarantees

S Zeng, C Li, A Garcia, M Hong - Advances in Neural …, 2022 - proceedings.neurips.cc
Inverse reinforcement learning (IRL) aims to recover the reward function and the associated
optimal policy that best fits observed sequences of states and actions implemented by an …

Feedback in imitation learning: The three regimes of covariate shift

J Spencer, S Choudhury, A Venkatraman… - arXiv preprint arXiv …, 2021 - arxiv.org
Imitation learning practitioners have often noted that conditioning policies on previous
actions leads to a dramatic divergence between "held out" error and performance of the …

Inverse decision modeling: Learning interpretable representations of behavior

D Jarrett, A Hüyük… - … Conference on Machine …, 2021 - proceedings.mlr.press
Decision analysis deals with modeling and enhancing decision processes. A principal
challenge in improving behavior is in obtaining a transparent *description* of existing …

Coherent soft imitation learning

J Watson, S Huang, N Heess - Advances in Neural …, 2024 - proceedings.neurips.cc
Imitation learning methods seek to learn from an expert either through behavioral cloning
(BC) for the policy or inverse reinforcement learning (IRL) for the reward. Such methods …

Proximal point imitation learning

L Viano, A Kamoutsi, G Neu… - Advances in Neural …, 2022 - proceedings.neurips.cc
This work develops new algorithms with rigorous efficiency guarantees for infinite horizon
imitation learning (IL) with linear function approximation without restrictive coherence …

SequenceMatch: Imitation learning for autoregressive sequence modelling with backtracking

C Cundy, S Ermon - arXiv preprint arXiv:2306.05426, 2023 - arxiv.org
In many domains, autoregressive models can attain high likelihood on the task of predicting
the next observation. However, this maximum-likelihood (MLE) objective does not …

A model-based solution to the offline multi-agent reinforcement learning coordination problem

P Barde, J Foerster, D Nowrouzezahrai… - arXiv preprint arXiv …, 2023 - arxiv.org
Training multiple agents to coordinate is an essential problem with applications in robotics,
game theory, economics, and social sciences. However, most existing Multi-Agent …

Diffusion imitation from observation

BR Huang, CK Yang, CM Lai, DJ Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Learning from observation (LfO) aims to imitate experts by learning from state-only
demonstrations without requiring action labels. Existing adversarial imitation learning …

All by myself: learning individualized competitive behavior with a contrastive reinforcement learning optimization

P Barros, A Sciutti - Neural Networks, 2022 - Elsevier
In a competitive game scenario, a set of agents have to learn decisions that maximize their
goals and minimize their adversaries' goals at the same time. Besides dealing with the …