MuZero with self-competition for rate control in VP9 video compression

S Gu, L Yang, Y Du, G Chen, F Walter, J Wang… - arXiv preprint arXiv …, 2022 - arxiv.org

Reinforcement Learning (RL) has achieved tremendous success in many complex decision-making
tasks. However, safety concerns are raised during deploying RL in real-world …

Cited by 240 · Related articles · All 2 versions

Loss of plasticity in continual deep reinforcement learning

Z Abbas, R Zhao, J Modayil, A White… - … on Lifelong Learning …, 2023 - proceedings.mlr.press

In this paper, we characterize the behavior of canonical value-based deep reinforcement
learning (RL) approaches under varying degrees of non-stationarity. In particular, we …

Cited by 52 · Related articles · All 4 versions

Adversarial training for high-stakes reliability

D Ziegler, S Nix, L Chan, T Bauman… - Advances in …, 2022 - proceedings.neurips.cc

In the future, powerful AI systems may be deployed in high-stakes settings, where a single
failure could be catastrophic. One technique for improving AI safety in high-stakes settings is …

Cited by 41 · Related articles · All 6 versions

Beyond games: a systematic review of neural Monte Carlo tree search applications

M Kemmerling, D Lütticke, RH Schmitt - Applied Intelligence, 2024 - Springer

The advent of AlphaGo and its successors marked the beginning of a new paradigm in
playing games using artificial intelligence. This was achieved by combining Monte Carlo …

Cited by 8 · Related articles · All 5 versions

Last-iterate convergent policy gradient primal-dual methods for constrained MDPs

D Ding, CY Wei, K Zhang… - Advances in Neural …, 2024 - proceedings.neurips.cc

We study the problem of computing an optimal policy of an infinite-horizon discounted
constrained Markov decision process (constrained MDP). Despite the popularity of …

Cited by 19 · Related articles · All 6 versions

High-accuracy model-based reinforcement learning, a survey

A Plaat, W Kosters, M Preuss - Artificial Intelligence Review, 2023 - Springer

Deep reinforcement learning has shown remarkable success in the past few years. Highly
complex sequential decision making problems from game playing and robotics have been …

Cited by 39 · Related articles · All 7 versions

GVFs in the real world: making predictions online for water treatment

MK Janjua, H Shah, M White, E Miahi, MC Machado… - Machine Learning, 2023 - Springer

In this paper we investigate the use of reinforcement-learning based prediction approaches
for a real drinking-water treatment plant. Developing such a prediction system is a critical …

Cited by 3 · Related articles · All 3 versions
