Multi-task reinforcement learning: a hierarchical Bayesian approach

K Khetarpal, M Riemer, I Rish, D Precup - Journal of Artificial Intelligence …, 2022 - jair.org

In this article, we aim to provide a literature review of different formulations and approaches
to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We …

Cited by 320 · Related articles · All 9 versions

An overview of multi-task learning

Y Zhang, Q Yang - National Science Review, 2018 - academic.oup.com

As a promising area in machine learning, multi-task learning (MTL) aims to improve the
performance of multiple related learning tasks by leveraging useful information among them …

Cited by 930 · Related articles · All 8 versions

Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

A Rame, G Couairon, C Dancette… - Advances in …, 2024 - proceedings.neurips.cc

Foundation models are first pre-trained on vast unsupervised datasets and then fine-tuned
on labeled data. Reinforcement learning, notably from human feedback (RLHF), can further …

Cited by 93 · Related articles · All 7 versions

A survey of deep reinforcement learning

Q Liu, J Zhai, Z Zhang, S Zhong, Q Zhou, P Zhang, J Xu - Chinese Journal of Computers, 2018 - cdn.jsdelivr.net

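Reinforcement learning learns a mapping from environment states to actions so as to obtain
the maximum reward signal. In large-scale … (Chinese Journal of Computers, Vol. 40, 2017, online publication No. 1) …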

Cited by 119 · Related articles · All 6 versions

Contrastive learning as goal-conditioned reinforcement learning

B Eysenbach, T Zhang, S Levine… - Advances in Neural …, 2022 - proceedings.neurips.cc

In reinforcement learning (RL), it is easier to solve a task if given a good representation.
While deep RL should automatically acquire such good representations, prior work often …

Cited by 134 · Related articles · All 6 versions

Multi-task learning for dense prediction tasks: A survey

S Vandenhende, S Georgoulis… - IEEE transactions on …, 2021 - ieeexplore.ieee.org

With the advent of deep learning, many dense prediction tasks, i.e., tasks that produce
pixel-level predictions, have seen significant performance improvements. The typical approach is …

Cited by 791 · Related articles · All 11 versions

Gradient surgery for multi-task learning

T Yu, S Kumar, A Gupta, S Levine… - Advances in Neural …, 2020 - proceedings.neurips.cc

While deep learning and deep reinforcement learning (RL) systems have demonstrated
impressive results in domains such as image classification, game playing, and robotic …

Cited by 1084 · Related articles · All 8 versions

Curriculum learning for reinforcement learning domains: A framework and survey

S Narvekar, B Peng, M Leonetti, J Sinapov… - Journal of Machine …, 2020 - jmlr.org

Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks
in which the agent has only limited environmental feedback. Despite many advances over …

Cited by 590 · Related articles · All 11 versions

Model-based reinforcement learning: A survey

TM Moerland, J Broekens, A Plaat… - … and Trends® in …, 2023 - nowpublishers.com

Sequential decision making, commonly formalized as Markov Decision Process (MDP)
optimization, is an important challenge in artificial intelligence. Two key approaches to this …

Cited by 891 · Related articles · All 17 versions

A definition of continual reinforcement learning

D Abel, A Barreto, B Van Roy… - Advances in …, 2024 - proceedings.neurips.cc

In a standard view of the reinforcement learning problem, an agent's goal is to efficiently
identify a policy that maximizes long-term reward. However, this perspective is based on a …

Cited by 62 · Related articles · All 8 versions