Towards a Competitive 3-Player Mahjong AI using Deep Reinforcement Learning

X Zhao, SB Holden - 2022 IEEE Conference on Games (CoG), 2022 - ieeexplore.ieee.org
Mahjong is a multi-player imperfect-information game with challenging features for AI
research. Sanma, being a 3-player variant of Japanese Riichi Mahjong, possesses unique …

Synergizing habits and goals with variational Bayes

D Han, K Doya, D Li, J Tani - Nature Communications, 2024 - nature.com
Behaving efficiently and flexibly is crucial for biological and artificial embodied agents.
Behavior is generally classified into two types: habitual (fast but inflexible), and goal-directed …

LsAc∗‐MJ: A Low‐Resource Consumption Reinforcement Learning Model for Mahjong Game

X Li, Z Wang, B Liu, J Dai - International Journal of Intelligent …, 2024 - Wiley Online Library
This article proposes a novel Mahjong game model, LsAc∗‐MJ, designed to address
challenges posed by data scarcity, difficulty in leveraging contextual information, and the …

MJ-DLVAT: A Deep Learning Value Assessment Technique for Mahjong

T Ogami, K Amano, Y Tsuruoka - 2024 IEEE Conference on …, 2024 - ieeexplore.ieee.org
In games with stochastic outcomes, evaluating agent performance from limited data is
challenging. Results of Monte Carlo sampling do not provide a reliable indicator due to the …

Habits and goals in synergy: a variational Bayesian framework for behavior

D Han, K Doya, D Li, J Tani - arXiv preprint arXiv:2304.05008, 2023 - arxiv.org
How to behave efficiently and flexibly is a central problem for understanding biological
agents and creating intelligent embodied AI. It has been well known that behavior can be …

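Toward building cognitive brain-type robots capable of abstraction, inference, and action planning: reinforcement learning and active inference in hierarchical, partially observable environments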

D Han - 2022 - oist.repo.nii.ac.jp
The thesis aims to advance cognitive decision-making and motor control using
reinforcement learning (RL) with stochastic recurrent neural networks (RNNs). RL is a …

Skill estimation in Mahjong using variance reduction methods

大神卓也, 天野克敏, 奈良亮耶… - Game Programming Work …, 2023 - ipsj.ixsq.nii.ac.jp
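Abstract: In Mahjong, the average rank, an indicator of a player's skill, is computed with the
Monte Carlo method, which averages the results of multiple games. With this method, the variance of the results is large …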