Anytime-valid off-policy inference for contextual bandits

I Waudby-Smith, L Wu, A Ramdas… - ACM/IMS Journal of …, 2024 - dl.acm.org
Contextual bandit algorithms are ubiquitous tools for active sequential experimentation in
healthcare and the tech industry. They involve online learning algorithms that adaptively …

Multi-armed bandit experimental design: Online decision-making and adaptive inference

D Simchi-Levi, C Wang - International Conference on …, 2023 - proceedings.mlr.press
Multi-armed bandit has been well-known for its efficiency in online decision-making in terms
of minimizing the loss of the participants' welfare during experiments (i.e., the regret). In …

Post-contextual-bandit inference

A Bibaut, M Dimakopoulou, N Kallus… - Advances in neural …, 2021 - proceedings.neurips.cc
Contextual bandit algorithms are increasingly replacing non-adaptive A/B tests in
e-commerce, healthcare, and policymaking because they can both improve outcomes for …

Clip-OGD: An experimental design for adaptive Neyman allocation in sequential experiments

J Dai, P Gradu, C Harshaw - Advances in Neural …, 2023 - proceedings.neurips.cc
From clinical development of cancer therapies to investigations into partisan bias, adaptive
sequential designs have become an increasingly popular method for causal inference, as they …

A Primer on the Analysis of Randomized Experiments and a Survey of some Recent Advances

Y Bai, AM Shaikh, M Tabord-Meehan - arXiv preprint arXiv:2405.03910, 2024 - arxiv.org
The past two decades have witnessed a surge of new research in the analysis of
randomized experiments. The emergence of this literature may seem surprising given the …

Microrandomized trials: developing just-in-time adaptive interventions for better public health

X Liu, N Deliu, B Chakraborty - American Journal of …, 2023 - ajph.aphapublications.org
Just-in-time adaptive interventions (JITAIs) represent an intervention design that adapts the
provision and type of support over time to an individual's changing status and contexts …

On instance-dependent bounds for offline reinforcement learning with linear function approximation

T Nguyen-Tang, M Yin, S Gupta, S Venkatesh… - Proceedings of the …, 2023 - ojs.aaai.org
Sample-efficient offline reinforcement learning (RL) with linear function approximation has
been studied extensively recently. Much of the prior work has yielded instance-independent …

Adaptive instrument design for indirect experiments

Y Chandak, S Shankar, V Syrgkanis… - The Twelfth International …, 2023 - openreview.net
Indirect experiments provide a valuable framework for estimating treatment effects in
situations where conducting randomized control trials (RCTs) is impractical or unethical …

Adaptive principal component regression with applications to panel data

A Agarwal, K Harris, J Whitehouse… - Advances in Neural …, 2024 - proceedings.neurips.cc
Principal component regression (PCR) is a popular technique for fixed-design
error-in-variables regression, a generalization of the linear regression setting in which the observed …

A lower bound for linear and kernel regression with adaptive covariates

T Lattimore - The Thirty Sixth Annual Conference on …, 2023 - proceedings.mlr.press
We prove that the continuous time version of the concentration bounds by Abbasi-Yadkori et
al. (2011) for adaptive linear regression cannot be improved in general, showing that there …