Decomposed soft actor-critic method for cooperative multi-agent reinforcement learning | Y Pu, S Wang, R Yang, X Yao, B Li | arXiv preprint arXiv:2104.06655, 2021 | 17 | 2021 |
LightZero: A unified benchmark for Monte Carlo tree search in general sequential decision scenarios | Y Niu, Y Pu, Z Yang, X Li, T Zhou, J Ren, S Hu, H Li, Y Liu | Advances in Neural Information Processing Systems 36, 2024 | 9 | 2024 |
Boltzmann Exploration for Deterministic Policy Optimization | S Wang, Y Pu, S Yang, X Yao, B Li | Neural Information Processing: 27th International Conference, ICONIP 2020 …, 2020 | 4 | 2020 |
Context-based soft actor critic for environments with non-stationary dynamics | Y Pu, S Wang, X Yao, B Li | arXiv preprint arXiv:2105.03310, 2021 | 2 | 2021 |
ReZero: Boosting MCTS-based Algorithms by Just-in-Time and Speedy Reanalyze | C Xuan, Y Niu, Y Pu, S Hu, J Yang | arXiv preprint arXiv:2404.16364, 2024 | 1 | 2024 |
UniZero: Generalized and Efficient Planning with Scalable Latent World Models | Y Pu, Y Niu, J Ren, Z Yang, H Li, Y Liu | arXiv preprint arXiv:2406.10667, 2024 | | 2024 |
Unifying Diverse Decision-Making Scenarios with Learned Discrete Actions | Y Niu, Y Pu, Y Chen, C Xuan, Z Yang, Y Liu, H Li | | | |
Neural Discrete Reinforcement Learning | Y Niu, Y Pu, C Li, Z Yang, H Li, Y Liu | | | |