On stochastic optimal control and reinforcement learning by approximate inference

K Rawlik, M Toussaint, S Vijayakumar - 2013 - direct.mit.edu
We present a reformulation of the stochastic optimal control problem in terms of KL
divergence minimisation, not only providing a unifying perspective of previous approaches …

Probabilistic planning with sequential Monte Carlo methods

A Piché, V Thomas, C Ibrahim, Y Bengio… - … Conference on Learning …, 2018 - openreview.net
In this work, we propose a novel formulation of planning which views it as a probabilistic
inference problem over future optimal trajectories. This enables us to use sampling methods …

Toward asymptotically optimal motion planning for kinodynamic systems using a two-point boundary value problem solver

C Xie, J Van Den Berg, S Patil… - 2015 IEEE International …, 2015 - ieeexplore.ieee.org
We present an approach for asymptotically optimal motion planning for kinodynamic
systems with arbitrary nonlinear dynamics amid obstacles. Optimal sampling-based …

Space-time functional gradient optimization for motion planning

A Byravan, B Boots, SS Srinivasa… - 2014 IEEE International …, 2014 - ieeexplore.ieee.org
Functional gradient algorithms (e.g. CHOMP) have recently shown great promise for
producing locally optimal motion for complex many degree-of-freedom robots. A key …

A real-world application of Markov chain Monte Carlo method for Bayesian trajectory control of a robotic manipulator

VT Aghaei, A Ağababaoğlu, S Yıldırım, A Onat - ISA transactions, 2022 - Elsevier
Reinforcement learning methods are being applied to control problems in robotics domain.
These algorithms are well suited for dealing with the continuous large scale state spaces in …

Topology-based representations for motion planning and generalization in dynamic environments with interactions

V Ivan, D Zarubin, M Toussaint… - … Journal of Robotics …, 2013 - journals.sagepub.com
Motion can be described in several alternative representations, including joint configuration
or end-effector spaces, but also more complex topology-based representations that imply a …

Outcome-driven reinforcement learning via variational inference

TGJ Rudner, V Pong, R McAllister… - Advances in Neural …, 2021 - proceedings.neurips.cc
While reinforcement learning algorithms provide automated acquisition of optimal policies,
practical application of such methods requires a number of design decisions, such as …

Extended LQR: Locally-optimal feedback control for systems with non-linear dynamics and non-quadratic cost

J Van Den Berg - Robotics Research: The 16th International Symposium …, 2016 - Springer
We present Extended LQR, a novel approach for locally-optimal control for robots with
non-linear dynamics and non-quadratic cost functions. Our formulation is conceptually different …

Exploiting variable physical damping in rapid movement tasks

A Radulescu, M Howard, DJ Braun… - 2012 IEEE/ASME …, 2012 - ieeexplore.ieee.org
Until now, design of variable physical impedance actuators (VIAs) has been limited mainly to
realising variable stiffness while other components of impedance shaping, such as damping …

Black-box policy search with probabilistic programs

JW van de Meent, B Paige, D Tolpin… - Artificial Intelligence …, 2016 - proceedings.mlr.press
In this work we show how to represent policies as programs: that is, as stochastic simulators
with tunable parameters. To learn the parameters of such policies we develop connections …