Enforcing hard constraints with soft barriers: Safe reinforcement learning in unknown stochastic environments

Y Wang, SS Zhan, R Jiao, Z Wang… - International …, 2023 - proceedings.mlr.press
It is quite challenging to ensure the safety of reinforcement learning (RL) agents in an
unknown and stochastic environment under hard constraints that require the system state …

Near-optimal model-free reinforcement learning in non-stationary episodic MDPs

W Mao, K Zhang, R Zhu… - … on Machine Learning, 2021 - proceedings.mlr.press
We consider model-free reinforcement learning (RL) in non-stationary Markov decision
processes. Both the reward functions and the state transition functions are allowed to vary …

Learning mixtures of linear dynamical systems

Y Chen, HV Poor - International conference on machine …, 2022 - proceedings.mlr.press
We study the problem of learning a mixture of multiple linear dynamical systems (LDSs) from
unlabeled short sample trajectories, each generated by one of the LDS models. Despite the …

CtrlFormer: Learning transferable state representation for visual control via transformer

Y Mu, S Chen, M Ding, J Chen, R Chen… - arXiv preprint arXiv …, 2022 - arxiv.org
Transformer has achieved great successes in learning vision and language representation,
which is general across various downstream tasks. In visual control, learning transferable …

Model-based transfer reinforcement learning based on graphical model representations

Y Sun, K Zhang, C Sun - IEEE Transactions on Neural …, 2021 - ieeexplore.ieee.org
Reinforcement learning (RL) plays an essential role in the field of artificial intelligence but
suffers from data inefficiency and model-shift issues. One possible solution to deal with such …

Transfer reinforcement learning via meta-knowledge extraction using auto-pruned decision trees

Y Lan, X Xu, Q Fang, Y Zeng, X Liu, X Zhang - Knowledge-Based Systems, 2022 - Elsevier
Transfer reinforcement learning (RL) has recently received increasing attention to make RL
agents have better learning performance in target Markov decision problems (MDPs) by …

Learning in non-cooperative configurable Markov decision processes

G Ramponi, AM Metelli, A Concetti… - Advances in Neural …, 2021 - proceedings.neurips.cc
The Configurable Markov Decision Process framework includes two entities: a
Reinforcement Learning agent and a configurator that can modify some environmental …

TA-Explore: Teacher-assisted exploration for facilitating fast reinforcement learning

A Beikmohammadi, S Magnússon - Proceedings of the 2023 …, 2023 - ifaamas.org
Reinforcement Learning (RL) is crucial for data-driven decision-making but suffers from
sample inefficiency. This poses a risk to system safety and can be costly in real-world …

Model-free non-stationary RL: Near-optimal regret and applications in multi-agent RL and inventory control

W Mao, K Zhang, R Zhu, D Simchi-Levi… - arXiv preprint arXiv …, 2020 - arxiv.org
We consider model-free reinforcement learning (RL) in non-stationary Markov decision
processes. Both the reward functions and the state transition functions are allowed to vary …

Temple: Learning template of transitions for sample efficient multi-task RL

Y Sun, X Yin, F Huang - Proceedings of the AAAI Conference on …, 2021 - ojs.aaai.org
Transferring knowledge among various environments is important for efficiently learning
multiple tasks online. Most existing methods directly use the previously learned models or …