Stop regressing: Training value functions via classification for scalable deep RL
Value functions are a central component of deep reinforcement learning (RL). These
functions, parameterized by neural networks, are trained using a mean squared error …
When Models Meet Data: Pragmatic Robot Learning with Model-based Optimization
M Bhardwaj - 2024 - digital.lib.washington.edu
Autonomous robots operating in complex and dynamic real-world scenarios must exhibit fast
and reactive behaviors to adapt to environment changes, and learn to improve their …