A tour of reinforcement learning: The view from continuous control
B Recht - Annual Review of Control, Robotics, and Autonomous …, 2019 - annualreviews.org
This article surveys reinforcement learning from the perspective of optimization and control,
with a focus on continuous control applications. It reviews the general formulation …
Fine-tuning language models with just forward passes
Fine-tuning language models (LMs) has yielded success on diverse downstream tasks, but
as LMs grow in size, backpropagation requires a prohibitively large amount of memory …
[BOOK][B] Control systems and reinforcement learning
S Meyn - 2022 - books.google.com
A high school student can create deep Q-learning code to control her robot, without any
understanding of the meaning of 'deep' or 'Q', or why the code sometimes fails. This book is …
Derivative-free optimization methods
In many optimization problems arising from scientific, engineering and artificial intelligence
applications, objective and constraint functions are available only as the output of a black …
Simple random search of static linear policies is competitive for reinforcement learning
Model-free reinforcement learning aims to offer off-the-shelf solutions for controlling
dynamical systems without requiring models of the system dynamics. We introduce a model …
Simple random search provides a competitive approach to reinforcement learning
A common belief in model-free reinforcement learning is that methods based on random
search in the parameter space of policies exhibit significantly worse sample complexity than …
Derivative-free methods for policy optimization: Guarantees for linear quadratic systems
We study derivative-free methods for policy optimization over the class of linear policies. We
focus on characterizing the convergence rate of these methods when applied to linear …
Optimal rates for zero-order convex optimization: The power of two function evaluations
We consider derivative-free algorithms for stochastic and nonstochastic convex optimization
problems that use only function values rather than gradients. Focusing on nonasymptotic …
Radiative backpropagation: An adjoint method for lightning-fast differentiable rendering
M Nimier-David, S Speierer, B Ruiz… - ACM Transactions on …, 2020 - dl.acm.org
Physically based differentiable rendering has recently evolved into a powerful tool for
solving inverse problems involving light. Methods in this area perform a differentiable …