Efficient frameworks for generalized low-rank matrix bandit problems

Y Kang, CJ Hsieh, TCM Lee - Advances in Neural …, 2022 - proceedings.neurips.cc
In the stochastic contextual low-rank matrix bandit problem, the expected reward of an action
is given by the inner product between the action's feature matrix and some fixed, but initially …

An analysis of ensemble sampling

C Qin, Z Wen, X Lu, B Van Roy - Advances in Neural …, 2022 - proceedings.neurips.cc
Ensemble sampling serves as a practical approximation to Thompson sampling when
maintaining an exact posterior distribution over model parameters is computationally …

Optimal algorithms for latent bandits with cluster structure

S Pal, AS Suggala, K Shanmugam… - … Conference on Artificial …, 2023 - proceedings.mlr.press
We consider the problem of latent bandits with cluster structure where there are multiple
users, each with an associated multi-armed bandit problem. These users are grouped into …

Optimal gradient-based algorithms for non-concave bandit optimization

B Huang, K Huang, S Kakade, JD Lee… - Advances in …, 2021 - proceedings.neurips.cc
Bandit problems with linear or concave reward have been extensively studied, but relatively
few works have studied bandits with non-concave reward. This work considers a large family …

Online low rank matrix completion

P Jain, S Pal - arXiv preprint arXiv:2209.03997, 2022 - arxiv.org
We study the problem of online low-rank matrix completion with $\mathsf{M}$ users,
$\mathsf{N}$ items and $\mathsf{T}$ rounds. In each round, the algorithm recommends …

Speed up the cold-start learning in two-sided bandits with many arms

M Bayati, J Cao, W Chen - arXiv preprint arXiv:2210.00340, 2022 - arxiv.org
Multi-armed bandit (MAB) algorithms are efficient approaches to reduce the opportunity cost
of online experimentation and are used by companies to find the best product from …

Targeted advertising on social networks using online variational tensor regression

T Idé, K Murugesan, D Bouneffouf, N Abe - arXiv preprint arXiv …, 2022 - arxiv.org
This paper is concerned with online targeted advertising on social networks. The main
technical task we address is to estimate the activation probability for user pairs, which …

Online matrix completion: A collaborative approach with hott items

D Baby, S Pal - arXiv preprint arXiv:2408.05843, 2024 - arxiv.org
We investigate the low rank matrix completion problem in an online setting with $M$
users, $N$ items, $T$ rounds, and an unknown rank-$r$ reward matrix $R\in\mathbb …

Online Low Rank Matrix Completion

S Pal, P Jain - The Eleventh International Conference on Learning …, 2022 - openreview.net
We study the problem of online low-rank matrix completion with $\mathsf{M}$ users,
$\mathsf{N}$ items and $\mathsf{T}$ rounds. In each round, the algorithm recommends …

Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems

Y Kang, CJ Hsieh, T Lee - arXiv preprint arXiv:2401.07298, 2024 - arxiv.org
In the stochastic contextual low-rank matrix bandit problem, the expected reward of an action
is given by the inner product between the action's feature matrix and some fixed, but initially …