A survey of online experiment design with the stochastic multi-armed bandit

G Burtini, J Loeppky, R Lawrence - arXiv preprint arXiv:1510.00757, 2015 - arxiv.org
Adaptive and sequential experiment design is a well-studied area in numerous domains. We
survey and synthesize the work of the online statistical learning paradigm referred to as multi …

Regret analysis of bandit problems with causal background knowledge

Y Lu, A Meisami, A Tewari… - … on Uncertainty in Artificial …, 2020 - proceedings.mlr.press
We study how to learn optimal interventions sequentially given causal information
represented as a causal graph along with associated conditional distributions. Causal …

Multi-armed bandit problem with known trend

D Bouneffouf, R Féraud - Neurocomputing, 2016 - Elsevier
We consider a variant of the multi-armed bandit model, which we call multi-armed bandit
problem with known trend, where the gambler knows the shape of the reward function of …

Ensemble recommendations via Thompson sampling: an experimental study within e-commerce

B Brodén, M Hammar, BJ Nilsson… - Proceedings of the 23rd …, 2018 - dl.acm.org
This work presents an extension of Thompson Sampling bandit policy for orchestrating the
collection of base recommendation algorithms for e-commerce. We focus on the problem of …

Multi-armed bandits for sleep recognition of elderly living in single-resident smart homes

ZK Shahid, S Saguna, C Åhlund - IEEE Internet of Things …, 2023 - ieeexplore.ieee.org
Sleep is an essential activity that affects an individual's health and ability to perform activities
of daily living (ADL). Inadequate sleep reduces cognitive capacity and leads to health …

A systematic literature review of solutions for cold start problem

N Singh, SK Singh - … Journal of System Assurance Engineering and …, 2024 - Springer
Insufficient knowledge about a new bug or a new developer, in the context of
recommendations done in software bug repositories (SBR) mining, impacts the …

Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling

Z Zhu, Y Liu, X Kuang, B Van Roy - arXiv preprint arXiv:2310.07786, 2023 - arxiv.org
Real-world applications of contextual bandits often exhibit non-stationarity due to
seasonality, serendipity, and evolving social trends. While a number of non-stationary …

A definition of non-stationary bandits

Y Liu, X Kuang, B Van Roy - arXiv preprint arXiv:2302.12202, 2023 - arxiv.org
Despite the subject of non-stationary bandit learning having attracted much recent attention,
we have yet to identify a formal definition of non-stationarity that can consistently distinguish …

Feature-based and adaptive rule adaptation in dynamic environments

A Tabebordbar, A Beheshti, B Benatallah… - Data Science and …, 2020 - Springer
Rule-based systems have been used increasingly to augment learning algorithms for
annotating data. Rules alleviate many of the shortcomings inherent in pure algorithmic …

Adaptive rule adaptation in unstructured and dynamic environments

A Tabebordbar, A Beheshti, B Benatallah… - … Engineering–WISE 2019 …, 2019 - Springer
Rule-based systems have been used to augment machine learning based algorithms for
annotating data in unstructured and dynamic environments. Rules can alleviate many of …