Meta clustering of neural bandits

Y Ban, Y Qi, T Wei, L Liu, J He - Proceedings of the 30th ACM SIGKDD …, 2024 - dl.acm.org
The contextual bandit has been identified as a powerful framework to formulate the
recommendation process as a sequential decision-making process, where each item is …

System-2 Recommenders: Disentangling Utility and Engagement in Recommendation Systems via Temporal Point-Processes

A Agarwal, N Usunier, A Lazaric, M Nickel - The 2024 ACM Conference …, 2024 - dl.acm.org
Recommender systems are an important part of the modern human experience whose
influence ranges from the food we eat to the news we read. Yet, there is still debate as to …

Long-term Off-Policy Evaluation and Learning

Y Saito, H Abdollahpouri, J Anderton… - Proceedings of the …, 2024 - dl.acm.org
Short- and long-term outcomes of an algorithm often differ, with damaging downstream
effects. A known example is a click-bait algorithm, which may increase short-term clicks but …

Contextual Bandit with Herding Effects: Algorithms and Recommendation Applications

L Xu, L Wang, H Xie, M Zhou - Pacific Rim International Conference on …, 2024 - Springer
Contextual bandits serve as a fundamental algorithmic framework for optimizing
recommendation decisions online. Though extensive attention has been paid to tailoring …

Maximizing success rate of payment routing using non-stationary bandits

A Chaudhary, A Rai, A Gupta - … of the Third International Conference on …, 2023 - dl.acm.org
This paper discusses the system architecture design and deployment of non-stationary
multi-armed bandit approaches to determine a near-optimal payment routing policy based on the …

Evaluating and Utilizing Surrogate Outcomes in Covariate-Adjusted Response-Adaptive Designs

W Zhang, A Hudson, M Petersen… - arXiv preprint arXiv …, 2024 - arxiv.org
This manuscript explores the intersection of surrogate outcomes and adaptive designs in
statistical research. While surrogate outcomes have long been studied for their potential to …

Predicting Long Term Sequential Policy Value Using Softer Surrogates

H Nam, A Nie, G Gao, V Syrgkanis… - arXiv preprint arXiv …, 2024 - arxiv.org
Performing policy evaluation in education, healthcare and online commerce can be
challenging, because it can require waiting substantial amounts of time to observe outcomes …

Neural Contextual Bandits for Personalized Recommendation

Y Ban, Y Qi, J He - Companion Proceedings of the ACM on Web …, 2024 - dl.acm.org
In the dynamic landscape of online businesses, recommender systems are pivotal in
enhancing user experiences. While traditional approaches have relied on static supervised …

Short-Long Policy Evaluation with Novel Actions

HA Nam, Y Chandak, E Brunskill - arXiv preprint arXiv:2407.03674, 2024 - arxiv.org
From incorporating LLMs in education, to identifying new drugs and improving ways to
charge batteries, innovators constantly try new strategies in search of better long-term …

The Fault in Our Recommendations: On the Perils of Optimizing the Measurable

O Besbes, Y Kanoria, A Kumar - arXiv preprint arXiv:2405.03948, 2024 - arxiv.org
Recommendation systems are widespread, and through customized recommendations,
promise to match users with options they will like. To that end, data on engagement is …