Statistical challenges in online controlled experiments: A review of A/B testing methodology

N Larsen, J Stallrich, S Sengupta, A Deng… - The American …, 2024 - Taylor & Francis
The rise of internet-based services and products in the late 1990s brought about an
unprecedented opportunity for online businesses to engage in large scale data-driven …

Game-theoretic statistics and safe anytime-valid inference

A Ramdas, P Grünwald, V Vovk, G Shafer - Statistical Science, 2023 - projecteuclid.org
Safe anytime-valid inference (SAVI) provides measures of statistical evidence and certainty—
e-processes for testing and confidence sequences for estimation—that remain valid at all …

[PDF] Introduction

RRM Coleman - Say It Loud!, 2013 - library.oapen.org
Now we demand a chance to do things for ourself We're tired of beatin our head against the
wall Say it loud, I'm Black and I'm proud.—James Brown, "Say It Loud" The past three …

The anytime-valid logrank test: Error control under continuous monitoring with unlimited horizon

J ter Schure, MF Pérez-Ortiz, A Ly… - The New England …, 2024 - nejsds.nestat.org
We introduce the anytime-valid (AV) logrank test, a version of the logrank test that provides
type-I error guarantees under optional stopping and optional continuation. The test is …

Near-optimal non-parametric sequential tests and confidence sequences with possibly dependent observations

A Bibaut, N Kallus, M Lindon - arXiv preprint arXiv:2212.14411, 2022 - arxiv.org
Sequential tests and their implied confidence sequences, which are valid at arbitrary
stopping times, promise flexible statistical inference and on-the-fly decision making …

Evidential calibration of confidence intervals

S Pawel, A Ly, EJ Wagenmakers - The American Statistician, 2024 - Taylor & Francis
We present a novel and easy-to-use method for calibrating error-rate based confidence
intervals to evidence-based support intervals. Support intervals are obtained from inverting …

Rapid regression detection in software deployments through sequential testing

M Lindon, C Sanden, V Shirikian - … of the 28th ACM SIGKDD Conference …, 2022 - dl.acm.org
The practice of continuous deployment has enabled companies to reduce time-to-market by
increasing the rate at which software can be deployed. However, deploying more frequently …

[HTML] Generic e-variables for exact sequential k-sample tests that allow for optional stopping

RJ Turner, A Ly, PD Grünwald - Journal of Statistical Planning and …, 2024 - Elsevier
We develop E-variables for testing whether two or more data streams come from the same
source or not, and more generally, whether the difference between the sources is larger than …

Safe sequential testing and effect estimation in stratified count data

R Turner, P Grunwald - International Conference on Artificial …, 2023 - proceedings.mlr.press
Sequential decision making significantly speeds up research and is more cost-effective
compared to fixed-n methods. We present a method for sequential decision making for …

Data Drift Monitoring for Log Anomaly Detection Pipelines

D Wani, S Ackerman, E Farchi, X Liu, H Chang… - arXiv preprint arXiv …, 2023 - arxiv.org
Logs enable the monitoring of infrastructure status and the performance of associated
applications. Logs are also invaluable for diagnosing the root causes of any problems that …