Accelerating approximate Thompson sampling with underdamped Langevin Monte Carlo
Approximate Thompson sampling with Langevin Monte Carlo broadens its reach
from Gaussian posterior sampling to encompass more general smooth posteriors. However …
Finite-time frequentist regret bounds of multi-agent Thompson sampling on sparse hypergraphs
We study the multi-agent multi-armed bandit (MAMAB) problem, where agents are factored
into overlapping groups. Each group represents a hyperedge, forming a hypergraph over …
Epsilon-Greedy Thompson Sampling to Bayesian Optimization
Bayesian optimization (BO) has become a powerful tool for solving simulation-based
engineering optimization problems thanks to its ability to integrate physical and …
Efficient and robust sequential decision making algorithms
P Xu - AI Magazine, 2024 - Wiley Online Library
Sequential decision‐making involves making informed decisions based on continuous
interactions with a complex environment. This process is ubiquitous in various applications …
ϵ-Neural Thompson Sampling of Deep Brain Stimulation for Parkinson Disease Treatment
Deep Brain Stimulation (DBS) stands as an effective intervention for alleviating the motor
symptoms of Parkinson's disease (PD). Traditional commercial DBS devices are only able to …
Only pay for what is uncertain: Variance-adaptive Thompson sampling
Most bandit algorithms assume that the reward variances or their upper bounds are known,
and that they are the same for all arms. This naturally leads to suboptimal performance and …
Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning
We present the first study on provably efficient randomized exploration in cooperative multi-
agent reinforcement learning (MARL). We propose a unified algorithm framework for …
ϵ-Neural Thompson Sampling of Deep Brain Stimulation for Parkinson Disease Treatment
Deep Brain Stimulation (DBS) stands as an effective intervention for alleviating the motor
symptoms of Parkinson's disease (PD). Traditional commercial DBS devices are only able to …
Joint User Association and Pairing in Multi-UAV-Assisted NOMA Networks: A Decaying-Epsilon Thompson Sampling Framework
Unmanned aerial vehicles (UAVs) are expected to be integrated into future wireless
networks to offer services, especially in unreachable or congested areas. To improve the …
The Choice of Noninformative Priors for Thompson Sampling in Multiparameter Bandit Models
J Lee, CK Chiang, M Sugiyama - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Thompson sampling (TS) has been known for its outstanding empirical performance
supported by theoretical guarantees across various reward models in the classical …