Soichiro Nishimori
Verified email at g.ecc.u-tokyo.ac.jp
Title
Cited by
Year
Pgx: Hardware-accelerated parallel game simulators for reinforcement learning
S Koyamada, S Okano, S Nishimori, Y Murata, K Habara, H Kita, S Ishii
Advances in Neural Information Processing Systems 36, 2024
Cited by 18, 2024
Mjx: A framework for Mahjong AI research
S Koyamada, K Habara, N Goto, S Okano, S Nishimori, S Ishii
2022 IEEE Conference on Games (CoG), 504-507, 2022
Cited by 3, 2022
A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees
T Kitamura, T Kozuno, M Kato, Y Ichihara, S Nishimori, A Sannai, ...
arXiv preprint arXiv:2401.17780, 2024
Cited by 2, 2024
A Batch Sequential Halving Algorithm without Performance Degradation
S Koyamada, S Nishimori, S Ishii
arXiv preprint arXiv:2406.00424, 2024
2024
Leveraging Domain-Unlabeled Data in Offline Reinforcement Learning across Two Domains
S Nishimori, XQ Cai, J Ackermann, M Sugiyama
arXiv preprint arXiv:2404.07465, 2024
2024
JAX-CORL: Clean Single-file Implementations of Offline RL Algorithms in JAX
S Nishimori
https://github.com/nissymori/JAX-CORL, 2024
2024
End-to-End Policy Gradient Method for POMDPs and Explainable Agents
S Nishimori, S Koyamada, S Ishii
arXiv preprint arXiv:2304.09769, 2023
2023
Articles 1–7