Fast adaptation to new environments via policy-dynamics value functions

K Xu, C Bai, X Ma, D Wang, B Zhao… - Advances in …, 2023 - proceedings.neurips.cc

Generalizing policies across different domains with dynamics mismatch poses a significant
challenge in reinforcement learning. For example, a robot learns the policy in a simulator …

Cited by 12 · Related articles · All 5 versions

[PDF] arxiv.org

Generalization to new sequential decision making tasks with in-context learning

SC Raparthy, E Hambro, R Kirk, M Henaff… - arXiv preprint arXiv …, 2023 - arxiv.org

Training autonomous agents that can learn new tasks from only a handful of demonstrations
is a long-standing problem in machine learning. Recently, transformers have been shown to …

Cited by 11 · Related articles · All 3 versions

[PDF] neurips.cc

Improving generalization in meta-RL with imaginary tasks from latent dynamics mixture

S Lee, SY Chung - Advances in Neural Information …, 2021 - proceedings.neurips.cc

The generalization ability of most meta-reinforcement learning (meta-RL) methods is largely
limited to test tasks that are sampled from the same distribution used to sample training …

Cited by 22 · Related articles · All 7 versions

An Information-Assisted Deep Reinforcement Learning Path Planning Scheme for Dynamic and Unknown Underwater Environment

M Xi, J Yang, J Wen, Z Li, W Lu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

An autonomous underwater vehicle (AUV) has shown impressive potential and promising
exploitation prospects in numerous marine missions. Among its various applications, the …

Cited by 3 · Related articles · All 2 versions

[PDF] arxiv.org

Neural stochastic dual dynamic programming

H Dai, Y Xue, Z Syed, D Schuurmans, B Dai - arXiv preprint arXiv …, 2021 - arxiv.org

Stochastic dual dynamic programming (SDDP) is a state-of-the-art method for solving multi-
stage stochastic optimization, widely used for modeling real-world process optimization …

Cited by 16 · Related articles · All 7 versions

[PDF] arxiv.org

Provably improved context-based offline meta-RL with attention and contrastive learning

L Li, Y Huang, M Chen, S Luo, D Luo… - arXiv preprint arXiv …, 2021 - arxiv.org

Meta-learning for offline reinforcement learning (OMRL) is an understudied problem with
tremendous potential impact by enabling RL algorithms in many real-world applications. A …

Cited by 17 · Related articles · All 4 versions

[PDF] neurips.cc

Parameterizing non-parametric meta-reinforcement learning tasks via subtask decomposition

S Lee, M Cho, Y Sung - Advances in Neural Information …, 2023 - proceedings.neurips.cc

Meta-reinforcement learning (meta-RL) techniques have demonstrated remarkable success
in generalizing deep reinforcement learning across a range of tasks. Nevertheless, these …

Cited by 2 · Related articles · All 4 versions

Achieving Fast Environment Adaptation of DRL-Based Computation Offloading in Mobile Edge Computing

Z Hu, J Niu, T Ren, M Guizani - IEEE Transactions on Mobile …, 2023 - ieeexplore.ieee.org

One of the key issues in mobile edge computing (MEC) is computation offloading, most
policies of which are developed based on mathematical programming (MP). Due to the high …

Cited by 1 · Related articles · All 5 versions

[PDF] arxiv.org

PAnDR: Fast adaptation to new environments from offline experiences via decoupling policy and environment representations

T Sang, H Tang, Y Ma, J Hao, Y Zheng, Z Meng… - arXiv preprint arXiv …, 2022 - arxiv.org

Deep Reinforcement Learning (DRL) has been a promising solution to many complex
decision-making problems. Nevertheless, the notorious weakness in generalization among …

Cited by 7 · Related articles · All 4 versions

[PDF] mlr.press

Evaluations of the gap between supervised and reinforcement lifelong learning on robotic manipulation tasks

F Yang, C Yang, H Liu, F Sun - Conference on Robot …, 2022 - proceedings.mlr.press

Overcoming catastrophic forgetting is of great importance for deep learning and robotics.
Recent lifelong learning research has great advances in supervised learning. However, little …

Cited by 13 · Related articles · All 3 versions