Pgx: Hardware-accelerated parallel game simulators for reinforcement learning S Koyamada, S Okano, S Nishimori, Y Murata, K Habara, H Kita, S Ishii Advances in Neural Information Processing Systems 36, 2024 | 18 | 2024 |
Mjx: A framework for Mahjong AI research S Koyamada, K Habara, N Goto, S Okano, S Nishimori, S Ishii 2022 IEEE Conference on Games (CoG), 504-507, 2022 | 3 | 2022 |
A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees T Kitamura, T Kozuno, M Kato, Y Ichihara, S Nishimori, A Sannai, ... arXiv preprint arXiv:2401.17780, 2024 | 2 | 2024 |
A Batch Sequential Halving Algorithm without Performance Degradation S Koyamada, S Nishimori, S Ishii arXiv preprint arXiv:2406.00424, 2024 | | 2024 |
Leveraging Domain-Unlabeled Data in Offline Reinforcement Learning across Two Domains S Nishimori, XQ Cai, J Ackermann, M Sugiyama arXiv preprint arXiv:2404.07465, 2024 | | 2024 |
JAX-CORL: Clean Single-file Implementations of Offline RL Algorithms in JAX S Nishimori https://github.com/nissymori/JAX-CORL, 2024 | | 2024 |
End-to-End Policy Gradient Method for POMDPs and Explainable Agents S Nishimori, S Koyamada, S Ishii arXiv preprint arXiv:2304.09769, 2023 | | 2023 |