Cross-domain policy adaptation via value-guided data filtering

K Xu, C Bai, X Ma, D Wang, B Zhao… - Advances in …, 2023 - proceedings.neurips.cc
Generalizing policies across different domains with dynamics mismatch poses a significant
challenge in reinforcement learning. For example, a robot learns the policy in a simulator …

Generalization to new sequential decision making tasks with in-context learning

SC Raparthy, E Hambro, R Kirk, M Henaff… - arXiv preprint arXiv …, 2023 - arxiv.org
Training autonomous agents that can learn new tasks from only a handful of demonstrations
is a long-standing problem in machine learning. Recently, transformers have been shown to …

Improving generalization in meta-RL with imaginary tasks from latent dynamics mixture

S Lee, SY Chung - Advances in Neural Information …, 2021 - proceedings.neurips.cc
The generalization ability of most meta-reinforcement learning (meta-RL) methods is largely
limited to test tasks that are sampled from the same distribution used to sample training …

An Information-Assisted Deep Reinforcement Learning Path Planning Scheme for Dynamic and Unknown Underwater Environment

M Xi, J Yang, J Wen, Z Li, W Lu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
An autonomous underwater vehicle (AUV) has shown impressive potential and promising
exploitation prospects in numerous marine missions. Among its various applications, the …

Neural stochastic dual dynamic programming

H Dai, Y Xue, Z Syed, D Schuurmans, B Dai - arXiv preprint arXiv …, 2021 - arxiv.org
Stochastic dual dynamic programming (SDDP) is a state-of-the-art method for solving
multi-stage stochastic optimization, widely used for modeling real-world process optimization …

Provably improved context-based offline meta-RL with attention and contrastive learning

L Li, Y Huang, M Chen, S Luo, D Luo… - arXiv preprint arXiv …, 2021 - arxiv.org
Meta-learning for offline reinforcement learning (OMRL) is an understudied problem with
tremendous potential impact by enabling RL algorithms in many real-world applications. A …

Parameterizing non-parametric meta-reinforcement learning tasks via subtask decomposition

S Lee, M Cho, Y Sung - Advances in Neural Information …, 2023 - proceedings.neurips.cc
Meta-reinforcement learning (meta-RL) techniques have demonstrated remarkable success
in generalizing deep reinforcement learning across a range of tasks. Nevertheless, these …

Achieving Fast Environment Adaptation of DRL-Based Computation Offloading in Mobile Edge Computing

Z Hu, J Niu, T Ren, M Guizani - IEEE Transactions on Mobile …, 2023 - ieeexplore.ieee.org
One of the key issues in mobile edge computing (MEC) is computation offloading, most
policies of which are developed based on mathematical programming (MP). Due to the high …

PAnDR: Fast adaptation to new environments from offline experiences via decoupling policy and environment representations

T Sang, H Tang, Y Ma, J Hao, Y Zheng, Z Meng… - arXiv preprint arXiv …, 2022 - arxiv.org
Deep Reinforcement Learning (DRL) has been a promising solution to many complex
decision-making problems. Nevertheless, the notorious weakness in generalization among …

Evaluations of the gap between supervised and reinforcement lifelong learning on robotic manipulation tasks

F Yang, C Yang, H Liu, F Sun - Conference on Robot …, 2022 - proceedings.mlr.press
Overcoming catastrophic forgetting is of great importance for deep learning and robotics.
Recent lifelong learning research has made great advances in supervised learning. However, little …