Safe policies for reinforcement learning via primal-dual methods
S Paternain, M Calvo-Fullana… - … on Automatic Control, 2022 - ieeexplore.ieee.org
In this article, we study the design of controllers in the context of stochastic optimal control
under the assumption that the model of the system is not available. This is, we aim to control …
Communication-efficient policy gradient methods for distributed reinforcement learning
This article deals with distributed policy optimization in reinforcement learning, which
involves a central controller and a group of learners. In particular, two typical settings …
The Path to Defence: A Roadmap to Characterising Data Poisoning Attacks on Victim Models
Data Poisoning Attacks (DPA) represent a sophisticated technique aimed at distorting the
training data of machine learning models, thereby manipulating their behavior. This process …
Learning safe policies via primal-dual methods
S Paternain, M Calvo-Fullana… - 2019 IEEE 58th …, 2019 - ieeexplore.ieee.org
In this paper, we study the learning of safe policies in the setting of reinforcement learning
problems. This is, we aim to control a Markov Decision Process (MDP) of which we do not …
On the sample complexity and metastability of heavy-tailed policy search in continuous control
Reinforcement learning is a framework for interactive decision-making with incentives
sequentially revealed across time without a system dynamics model. Due to its scaling to …
Autonomous Driving Control for Passing Unsignalized Intersections Using the Semantic Segmentation Technique
J Tsai, YT Chang, ZY Chen, Z You - Electronics, 2024 - mdpi.com
Autonomous driving in urban areas is challenging because it requires understanding
vehicle movements, traffic rules, map topologies and unknown environments in the highly …
Deterministic Policy Gradient Primal-Dual Methods for Continuous-Space Constrained MDPs
We study the problem of computing deterministic optimal policies for constrained Markov
decision processes (MDPs) with continuous state and action spaces, which are widely …
Exploring Gradient Explosion in Generative Adversarial Imitation Learning: A Probabilistic Perspective
Generative Adversarial Imitation Learning (GAIL) stands as a cornerstone approach in
imitation learning. This paper investigates the gradient explosion in two types of GAIL: GAIL …
Multilinear Tensor Low-Rank Approximation for Policy-Gradient Methods in Reinforcement Learning
Reinforcement learning (RL) aims to estimate the action to take given a (time-varying) state,
with the goal of maximizing a cumulative reward function. Predominantly, there are two …
Towards delivering a coherent self-contained explanation of proximal policy optimization
D Bick - 2021 - fse.studenttheses.ub.rug.nl
Reinforcement Learning (RL), and these days particularly Deep Reinforcement Learning
(DRL), is concerned with the development, study, and application of algorithms that are …