Efficient frameworks for generalized low-rank matrix bandit problems
In the stochastic contextual low-rank matrix bandit problem, the expected reward of an action
is given by the inner product between the action's feature matrix and some fixed, but initially …
An analysis of ensemble sampling
Ensemble sampling serves as a practical approximation to Thompson sampling when
maintaining an exact posterior distribution over model parameters is computationally …
Optimal algorithms for latent bandits with cluster structure
We consider the problem of latent bandits with cluster structure where there are multiple
users, each with an associated multi-armed bandit problem. These users are grouped into …
Optimal gradient-based algorithms for non-concave bandit optimization
Bandit problems with linear or concave reward have been extensively studied, but relatively
few works have studied bandits with non-concave reward. This work considers a large family …
Online low rank matrix completion
We study the problem of online low-rank matrix completion with $\mathsf{M}$ users,
$\mathsf{N}$ items and $\mathsf{T}$ rounds. In each round, the algorithm recommends …
Speed up the cold-start learning in two-sided bandits with many arms
Multi-armed bandit (MAB) algorithms are efficient approaches to reduce the opportunity cost
of online experimentation and are used by companies to find the best product from …
Targeted advertising on social networks using online variational tensor regression
This paper is concerned with online targeted advertising on social networks. The main
technical task we address is to estimate the activation probability for user pairs, which …
Online matrix completion: A collaborative approach with hott items
We investigate the low rank matrix completion problem in an online setting with $M$
users, $N$ items, $T$ rounds, and an unknown rank-$r$ reward matrix $R\in\mathbb …
Online Low Rank Matrix Completion
We study the problem of online low-rank matrix completion with $\mathsf{M}$ users,
$\mathsf{N}$ items and $\mathsf{T}$ rounds. In each round, the algorithm recommends …
Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems
In the stochastic contextual low-rank matrix bandit problem, the expected reward of an action
is given by the inner product between the action's feature matrix and some fixed, but initially …