A survey of online experiment design with the stochastic multi-armed bandit
Adaptive and sequential experiment design is a well-studied area in numerous domains. We
survey and synthesize the work of the online statistical learning paradigm referred to as multi …
Regret analysis of bandit problems with causal background knowledge
We study how to learn optimal interventions sequentially given causal information
represented as a causal graph along with associated conditional distributions. Causal …
Multi-armed bandit problem with known trend
D Bouneffouf, R Féraud - Neurocomputing, 2016 - Elsevier
We consider a variant of the multi-armed bandit model, which we call multi-armed bandit
problem with known trend, where the gambler knows the shape of the reward function of …
Ensemble recommendations via Thompson sampling: an experimental study within e-commerce
B Brodén, M Hammar, BJ Nilsson… - Proceedings of the 23rd …, 2018 - dl.acm.org
This work presents an extension of Thompson Sampling bandit policy for orchestrating the
collection of base recommendation algorithms for e-commerce. We focus on the problem of …
Multi-armed bandits for sleep recognition of elderly living in single-resident smart homes
Sleep is an essential activity that affects an individual's health and ability to perform activities
of daily living (ADL). Inadequate sleep reduces cognitive capacity and leads to health …
A systematic literature review of solutions for cold start problem
N Singh, SK Singh - … Journal of System Assurance Engineering and …, 2024 - Springer
Insufficient knowledge about a new bug or a new developer, in the context of
recommendations done in software bug repositories (SBR) mining, impacts the …
Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling
Real-world applications of contextual bandits often exhibit non-stationarity due to
seasonality, serendipity, and evolving social trends. While a number of non-stationary …
A definition of non-stationary bandits
Despite the subject of non-stationary bandit learning having attracted much recent attention,
we have yet to identify a formal definition of non-stationarity that can consistently distinguish …
Feature-based and adaptive rule adaptation in dynamic environments
Rule-based systems have been used increasingly to augment learning algorithms for
annotating data. Rules alleviate many of the shortcomings inherent in pure algorithmic …
Adaptive rule adaptation in unstructured and dynamic environments
Rule-based systems have been used to augment machine learning based algorithms for
annotating data in unstructured and dynamic environments. Rules can alleviate many of …