A definition of continual reinforcement learning

D Abel, A Barreto, B Van Roy… - Advances in …, 2024 - proceedings.neurips.cc
In a standard view of the reinforcement learning problem, an agent's goal is to efficiently
identify a policy that maximizes long-term reward. However, this perspective is based on a …

Settling the reward hypothesis

M Bowling, JD Martin, D Abel… - … on Machine Learning, 2023 - proceedings.mlr.press
The reward hypothesis posits that "all of what we mean by goals and purposes can be well
thought of as maximization of the expected value of the cumulative sum of a received scalar …

Self-predictive universal AI

E Catt, J Grau-Moya, M Hutter… - Advances in …, 2023 - proceedings.neurips.cc
Reinforcement Learning (RL) algorithms typically utilize learning and/or planning
techniques to derive effective policies. The integration of both approaches has proven to be …

Conditions on Preference Relations that Guarantee the Existence of Optimal Policies

JC Carr, P Panangaden… - … Conference on Artificial …, 2024 - proceedings.mlr.press
Learning from Preferential Feedback (LfPF) plays an essential role in training Large
Language Models, as well as certain types of interactive learning agents. However, a …

State and action abstraction for search and reinforcement learning algorithms

A Dockhorn, R Kruse - Artificial Intelligence in Control and Decision …, 2023 - Springer
Decision-making in large and dynamic environments has always been a challenge for AI
agents. Given the multitude of available sensors in robotics and the rising complexity of …

On Reward Binarisation and Bayesian Agents

E Catt, J Veness, M Hutter - European Workshop on …, 2022 - ewrl.wordpress.com
Reward binarisation is a common heuristically applied technique which can potentially
simplify a given reinforcement learning problem. However this procedure done without care …

On the Foundations of Universal Artificial Intelligence

E Catt - 2022