Human-in-the-loop reinforcement learning: A survey and position on requirements, challenges, and opportunities

CO Retzlaff, S Das, C Wayllace, P Mousavi… - Journal of Artificial …, 2024 - jair.org
Artificial intelligence (AI) and especially reinforcement learning (RL) have the potential to
enable agents to learn and perform tasks autonomously with superhuman performance …

Secure-by-construction synthesis of cyber-physical systems

S Liu, A Trivedi, X Yin, M Zamani - Annual Reviews in Control, 2022 - Elsevier
Correct-by-construction synthesis is a cornerstone of the confluence of formal methods and
control theory towards designing safety-critical systems. Instead of following the time-tested …

Reward machines: Exploiting reward function structure in reinforcement learning

RT Icarte, TQ Klassen, R Valenzano… - Journal of Artificial …, 2022 - jair.org
Reinforcement learning (RL) methods usually treat reward functions as black boxes. As
such, these methods must extensively interact with the environment in order to discover …

LTL and Beyond: Formal Languages for Reward Function Specification in Reinforcement Learning

A Camacho, RT Icarte, TQ Klassen, RA Valenzano… - IJCAI, 2019 - ijcai.org
In Reinforcement Learning (RL), an agent is guided by the rewards it receives from
the reward function. Unfortunately, it may take many interactions with the environment to …

On the expressivity of Markov reward

D Abel, W Dabney, A Harutyunyan… - Advances in …, 2021 - proceedings.neurips.cc
Reward is the driving force for reinforcement-learning agents. This paper is dedicated to
understanding the expressivity of reward as a way to capture tasks that we would want an …

The perils of trial-and-error reward design: misdesign through overfitting and invalid task specifications

S Booth, WB Knox, J Shah, S Niekum, P Stone… - Proceedings of the …, 2023 - ojs.aaai.org
In reinforcement learning (RL), a reward function that aligns exactly with a task's true
performance metric is often necessarily sparse. For example, a true task metric might …

Toward verified artificial intelligence

SA Seshia, D Sadigh, SS Sastry - Communications of the ACM, 2022 - dl.acm.org
Contributed article, Communications of the ACM, July 2022, Vol. 65, No. 7.
Illustration by Peter Crowther …

Explainable reinforcement learning via reward decomposition

Z Juozapaitis, A Koul, A Fern, M Erwig… - IJCAI/ECAI Workshop on …, 2019 - par.nsf.gov
We study reward decomposition for explaining the decisions of reinforcement learning (RL)
agents. The approach decomposes rewards into sums of semantically meaningful reward …

A survey on interpretable reinforcement learning

C Glanois, P Weng, M Zimmer, D Li, T Yang, J Hao… - Machine Learning, 2024 - Springer
Although deep reinforcement learning has become a promising machine learning approach
for sequential decision-making problems, it is still not mature enough for high-stake domains …

Compositional reinforcement learning from logical specifications

K Jothimurugan, S Bansal… - Advances in Neural …, 2021 - proceedings.neurips.cc
We study the problem of learning control policies for complex tasks given by logical
specifications. Recent approaches automatically generate a reward function from a given …