Honor of Kings Arena: an environment for generalization in competitive reinforcement learning

X Xu, Y Wang, C Xu, Z Ding, J Jiang, Z Ding… - arXiv preprint arXiv …, 2024 - arxiv.org

The swift evolution of Large-scale Models (LMs), either language-focused or multi-modal,
has garnered extensive attention in both academy and industry. But despite the surge in …

Cited by 4 · Related articles · All 5 versions

[PDF] neurips.cc

RL-ViGen: A reinforcement learning benchmark for visual generalization

Z Yuan, S Yang, P Hua, C Chang… - Advances in Neural …, 2024 - proceedings.neurips.cc

Abstract Visual Reinforcement Learning (Visual RL), coupled with high-dimensional
observations, has consistently confronted the long-standing challenge of out-of-distribution …

Cited by 6 · Related articles · All 5 versions

[PDF] aaai.org

Prompt to Transfer: Sim-to-Real Transfer for Traffic Signal Control with Prompt Learning

L Da, M Gao, H Mei, H Wei - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org

Numerous methods are proposed for the Traffic Signal Control (TSC) tasks aiming to provide
efficient transportation and mitigate congestion waste. In recent, promising results have …

Cited by 4 · Related articles

Games for Artificial Intelligence Research: A Review and Perspectives

C Hu, Y Zhao, Z Wang, H Du… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Games have been the perfect test-beds for artificial intelligence research for the
characteristics that widely exist in real-world scenarios. Learning and optimisation, decision …

Cited by 1 · Related articles · All 2 versions

[PDF] neurips.cc

Hokoff: real game dataset from Honor of Kings and its offline reinforcement learning benchmarks

Y Qu, B Wang, J Shao, Y Jiang… - Advances in …, 2024 - proceedings.neurips.cc

Abstract The advancement of Offline Reinforcement Learning (RL) and Offline Multi-Agent
Reinforcement Learning (MARL) critically depends on the availability of high-quality, pre …

Cited by 1 · Related articles · All 3 versions

[PDF] arxiv.org

Revisiting discrete soft actor-critic

H Zhou, Z Lin, J Li, Q Fu, W Yang, D Ye - arXiv preprint arXiv:2209.10081, 2022 - arxiv.org

We study the adaption of soft actor-critic (SAC) from continuous action space to discrete
action space. We revisit vanilla SAC and provide an in-depth understanding of its Q value …

Cited by 11 · Related articles · All 2 versions

[PDF] arxiv.org

Tackling cooperative incompatibility for zero-shot human-AI coordination

Y Li, S Zhang, J Sun, W Zhang, Y Du, Y Wen… - arXiv preprint arXiv …, 2023 - arxiv.org

Securing coordination between AI agent and teammates (human players or AI agents) in
contexts involving unfamiliar humans continues to pose a significant challenge in Zero-Shot …

Cited by 5 · Related articles · All 2 versions

[PDF] arxiv.org

Technical challenges of deploying reinforcement learning agents for game testing in AAA games

J Gillberg, J Bergdahl, A Sestini… - … IEEE Conference on …, 2023 - ieeexplore.ieee.org

Going from research to production, especially for large and complex software systems, is
fundamentally a hard problem. In large-scale game production, one of the main reasons is …

Cited by 2 · Related articles · All 3 versions

[PDF] arxiv.org

Enhancing Human Experience in Human-Agent Collaboration: A Human-Centered Modeling Approach Based on Positive Human Gain

Y Gao, F Liu, L Wang, Z Lian, D Zheng, W Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Existing game AI research mainly focuses on enhancing agents' abilities to win games, but
this does not inherently make humans have a better experience when collaborating with …

Cited by 1 · Related articles · All 3 versions

[PDF] aaai.org

Probabilistic Offline Policy Ranking with Approximate Bayesian Computation

L Da, P Jenkins, T Schwantes, J Dotson… - Proceedings of the AAAI …, 2024 - ojs.aaai.org

In practice, it is essential to compare and rank candidate policies offline before real-world
deployment for safety and reliability. Prior work seeks to solve this offline policy ranking …