Deep learning, reinforcement learning, and world models

Y Matsuo, Y LeCun, M Sahani, D Precup, D Silver… - Neural Networks, 2022 - Elsevier
Deep learning (DL) and reinforcement learning (RL) methods seem to be a part of
indispensable factors to achieve human-level or super-human AI systems. On the other …

How to train your robot with deep reinforcement learning: lessons we have learned

J Ibarz, J Tan, C Finn, M Kalakrishnan… - … Journal of Robotics …, 2021 - journals.sagepub.com
Deep reinforcement learning (RL) has emerged as a promising approach for autonomously
acquiring complex behaviors from low-level sensor observations. Although a large portion of …

Learning agile soccer skills for a bipedal robot with deep reinforcement learning

T Haarnoja, B Moran, G Lever, SH Huang… - Science Robotics, 2024 - science.org
We investigated whether deep reinforcement learning (deep RL) is able to synthesize
sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be …

Soft pneumatic actuators: A review of design, fabrication, modeling, sensing, control and applications

MS Xavier, CD Tawk, A Zolfagharian, J Pinskier… - IEEE …, 2022 - ieeexplore.ieee.org
Soft robotics is a rapidly evolving field where robots are fabricated using highly deformable
materials and usually follow a bioinspired design. Their high dexterity and safety make them …

Emergent tool use from multi-agent autocurricula

B Baker, I Kanitscheider, T Markov, Y Wu… - arXiv preprint arXiv …, 2019 - arxiv.org
Through multi-agent competition, the simple objective of hide-and-seek, and standard
reinforcement learning algorithms at scale, we find that agents create a self-supervised …

Learning agile robotic locomotion skills by imitating animals

XB Peng, E Coumans, T Zhang, TW Lee, J Tan… - arXiv preprint arXiv …, 2020 - arxiv.org
Reproducing the diverse and agile locomotion skills of animals has been a longstanding
challenge in robotics. While manually-designed controllers have been able to emulate many …

Critic regularized regression

Z Wang, A Novikov, K Zolna, JS Merel… - Advances in …, 2020 - proceedings.neurips.cc
Offline reinforcement learning (RL), also known as batch RL, offers the prospect of policy
optimization from large pre-recorded datasets without online environment interaction. It …

Advantage-weighted regression: Simple and scalable off-policy reinforcement learning

XB Peng, A Kumar, G Zhang, S Levine - arXiv preprint arXiv:1910.00177, 2019 - arxiv.org
In this paper, we aim to develop a simple and scalable reinforcement learning algorithm that
uses standard supervised learning methods as subroutines. Our goal is an algorithm that …

A distributional code for value in dopamine-based reinforcement learning

W Dabney, Z Kurth-Nelson, N Uchida, CK Starkweather… - Nature, 2020 - nature.com
Since its introduction, the reward prediction error theory of dopamine has explained a wealth
of empirical phenomena, providing a unifying framework for understanding the …

Learning agile and dynamic motor skills for legged robots

J Hwangbo, J Lee, A Dosovitskiy, D Bellicoso… - Science Robotics, 2019 - science.org
Legged robots pose one of the greatest challenges in robotics. Dynamic and agile
maneuvers of animals cannot be imitated by existing methods that are crafted by humans. A …