Thresholded lasso bandit K Ariu, K Abe, A Proutière International Conference on Machine Learning (ICML 2022), 2022 | 24 | 2022 |
Off-Policy Exploitability-Evaluation in Two-Player Zero-Sum Markov Games K Abe, Y Kaneko International Conference on Autonomous Agents and Multiagent Systems (AAMAS …, 2021 | 18* | 2021 |
Mutation-Driven Follow the Regularized Leader for Last-Iterate Convergence in Zero-Sum Games K Abe, M Sakamoto, A Iwasaki Conference on Uncertainty in Artificial Intelligence (UAI 2022), 2022 | 15 | 2022 |
Last-Iterate Convergence with Full and Noisy Feedback in Two-Player Zero-Sum Games K Abe, K Ariu, M Sakamoto, K Toyoshima, A Iwasaki International Conference on Artificial Intelligence and Statistics (AISTATS …, 2023 | 12 | 2023 |
Anytime Capacity Expansion in Medical Residency Match by Monte Carlo Tree Search K Abe, J Komiyama, A Iwasaki International Joint Conference on Artificial Intelligence (IJCAI 2022), 2022 | 10 | 2022 |
Learning in Multi-Memory Games Triggers Complex Dynamics Diverging from Nash Equilibrium Y Fujimoto, K Ariu, K Abe International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023 | 3 | 2023 |
A practical guide of off-policy evaluation for bandit problems M Kato, K Abe, K Ariu, S Yasui arXiv preprint arXiv:2010.12470, 2020 | 3 | 2020 |
Online Learning for Bidding Agent in First Price Auction G Morishita, K Abe, K Ogawa, Y Kaneko AAAI-20 Workshop on Reinforcement Learning in Games, 2020 | 3 | 2020 |
Scalable and Provably Fair Exposure Control for Large-Scale Recommender Systems R Togashi, K Abe, Y Saito The Web Conference (WWW 2024), 2024 | 2* | 2024 |
Learning Fair Division from Bandit Feedback H Yamada, J Komiyama, K Abe, A Iwasaki International Conference on Artificial Intelligence and Statistics (AISTATS …, 2024 | 2 | 2024 |
Filtered direct preference optimization T Morimura, M Sakamoto, Y Jinnai, K Abe, K Ariu arXiv preprint arXiv:2404.13846, 2024 | 2 | 2024 |
Memory Asymmetry Creates Heteroclinic Orbits to Nash Equilibrium in Learning in Zero-Sum Games Y Fujimoto, K Ariu, K Abe Annual AAAI Conference on Artificial Intelligence (AAAI 2024), 2024 | 2 | 2024 |
Policy Gradient Algorithms with Monte-Carlo Tree Search for Non-Markov Decision Processes T Morimura, K Ota, K Abe, P Zhang arXiv preprint arXiv:2206.01011, 2022 | 2 | 2022 |
Model-based minimum bayes risk decoding Y Jinnai, T Morimura, U Honda, K Ariu, K Abe International Conference on Machine Learning (ICML 2024), 2024 | 1 | 2024 |
Regularized Best-of-N Sampling to Mitigate Reward Hacking for Language Model Alignment Y Jinnai, T Morimura, K Ariu, K Abe arXiv preprint arXiv:2404.01054, 2024 | 1 | 2024 |
A Study on Multi-Agent Reinforcement Learning in Cournot Competition K Toyoshima, M Sakamoto, K Abe, A Iwasaki Proceedings of the 84th National Convention 2022 (1), 11-12, 2022 | 1 | 2022 |
Mean-Variance Efficient Reinforcement Learning by Expected Quadratic Utility Maximization M Kato, K Nakagawa, K Abe, T Morimura arXiv preprint arXiv:2010.01404, 2020 | 1* | 2020 |
A simple heuristic for Bayesian optimization with a low budget M Nomura, K Abe arXiv preprint arXiv:1911.07790, 2019 | 1 | 2019 |
Adaptively Perturbed Mirror Descent for Learning in Games K Abe, K Ariu, M Sakamoto, A Iwasaki International Conference on Machine Learning (ICML 2024), 2024 | | 2024 |
Global Behavior of Learning Dynamics in Zero-Sum Games with Memory Asymmetry Y Fujimoto, K Ariu, K Abe arXiv preprint arXiv:2405.14546, 2024 | | 2024 |