Dynamics generalization via information bottleneck in deep reinforcement learning

R Kirk, A Zhang, E Grefenstette, T Rocktäschel - Journal of Artificial …, 2023 - jair.org

The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations at …

Cited by 400 - Related articles - All 9 versions

[PDF] neurips.cc

Why generalization in RL is difficult: Epistemic POMDPs and implicit partial observability

D Ghosh, J Rahme, A Kumar, A Zhang… - Advances in neural …, 2021 - proceedings.neurips.cc

Generalization is a central challenge for the deployment of reinforcement learning (RL)
systems in the real world. In this paper, we show that the sequential structure of the RL …

Cited by 122 - Related articles - All 10 versions

[PDF] neurips.cc

Efficient knowledge distillation from model checkpoints

C Wang, Q Yang, R Huang, S Song… - Advances in Neural …, 2022 - proceedings.neurips.cc

Knowledge distillation is an effective approach to learn compact models (students)
with the supervision of large and strong models (teachers). As empirically there exists a …

Cited by 47 - Related articles - All 6 versions

[PDF] neurips.cc

PID-inspired inductive biases for deep reinforcement learning in partially observable control tasks

I Char, J Schneider - Advances in Neural Information …, 2024 - proceedings.neurips.cc

Deep reinforcement learning (RL) has shown immense potential for learning to control
systems through data alone. However, one challenge deep RL faces is that the full state of …

Cited by 5 - Related articles - All 5 versions

[PDF] neurips.cc

Robust predictable control

B Eysenbach, RR Salakhutdinov… - Advances in Neural …, 2021 - proceedings.neurips.cc

Many of the challenges facing today's reinforcement learning (RL) algorithms, such as
robustness, generalization, transfer, and computational efficiency are closely related to …

Cited by 42 - Related articles - All 7 versions

[PDF] arxiv.org

Selective visual representations improve convergence and generalization for embodied AI

A Eftekhar, KH Zeng, J Duan, A Farhadi… - arXiv preprint arXiv …, 2023 - arxiv.org

Embodied AI models often employ off the shelf vision backbones like CLIP to encode their
visual observations. Although such general purpose representations encode rich syntactic …

Cited by 10 - Related articles - All 3 versions

[PDF] neurips.cc

Dynamics generalisation in reinforcement learning via adaptive context-aware policies

M Beukman, D Jarvis, R Klein… - Advances in Neural …, 2024 - proceedings.neurips.cc

While reinforcement learning has achieved remarkable successes in several domains, its
real-world application is limited due to many methods failing to generalise to unfamiliar …

Cited by 13 - Related articles - All 12 versions

[PDF] wiley.com

Markov‐GAN: Markov image enhancement method for malicious encrypted traffic classification

Z Tang, J Wang, B Yuan, H Li, J Zhang… - IET Information …, 2022 - Wiley Online Library

The rapidly growing encrypted traffic hides a large number of malicious behaviours. The
difficulty of collecting and labelling encrypted traffic makes the class distribution of dataset …

Cited by 21 - Related articles - All 4 versions

A deep residual reinforcement learning algorithm based on Soft Actor-Critic for autonomous navigation

S Wen, Y Shu, A Rad, Z Wen, Z Guo, S Gong - Expert Systems with …, 2025 - Elsevier

The problem of autonomous navigation has attracted significant attention from robotics
research community in the last few decades. In this paper, we address the problem of low …

Cited by 1 - Related articles - All 2 versions

[PDF] aaai.org

Learn goal-conditioned policy with intrinsic motivation for deep reinforcement learning

J Liu, D Wang, Q Tian, Z Chen - Proceedings of the AAAI conference on …, 2022 - ojs.aaai.org

It is of significance for an agent to autonomously explore the environment and learn a widely
applicable and general-purpose goal-conditioned policy that can achieve diverse goals …

Cited by 22 - Related articles - All 7 versions