Statistical inference on multi-armed bandits with delayed feedback
Multi-armed bandit (MAB) algorithms have been increasingly used to complement or
integrate with A/B tests and randomized clinical trials in e-commerce, healthcare, and …
Adaptive linear estimating equations
Sequential data collection has emerged as a widely adopted technique for enhancing the
efficiency of data gathering processes. Despite its advantages, such data collection …
Off-policy evaluation beyond overlap: partial identification through smoothness
Off-policy evaluation (OPE) is the problem of estimating the value of a target policy using
historical data collected under a different logging policy. OPE methods typically assume …
Causal reinforcement learning: An instrumental variable approach
In the standard data analysis framework, data is first collected (once for all), and then data
analysis is carried out. Moreover, the data-generating process is typically assumed to be …
Statistical inference after adaptive sampling in non-markovian environments
There is a great desire to use adaptive sampling methods, such as reinforcement learning
(RL) and bandit algorithms, for the real-time personalization of interventions in digital …
Battling the coronavirus 'infodemic' among social media users in Kenya and Nigeria
How can we induce social media users to be discerning when sharing information during a
pandemic? An experiment on Facebook Messenger with users from Kenya (n = 7,498) and …
Counterfactual inference for sequential experiments
We consider after-study statistical inference for sequentially designed experiments wherein
multiple units are assigned treatments for multiple time points using treatment policies that …
Uncertainty-aware instance reweighting for off-policy learning
Off-policy learning, referring to the procedure of policy optimization with access only to
logged feedback data, has shown importance in various important real-world applications …
Double/debiased machine learning for dynamic treatment effects via g-estimation
G Lewis, V Syrgkanis - arXiv preprint arXiv:2002.07285, 2020 - arxiv.org
We consider the estimation of treatment effects in settings when multiple treatments are
assigned over time and treatments can have a causal effect on future outcomes or the state …
Statistical limits of adaptive linear models: low-dimensional estimation and inference
Estimation and inference in statistics pose significant challenges when data are collected
adaptively. Even in linear models, the Ordinary Least Squares (OLS) estimator may fail to …