Meta clustering of neural bandits
The contextual bandit has been identified as a powerful framework to formulate the
recommendation process as a sequential decision-making process, where each item is …
System-2 Recommenders: Disentangling Utility and Engagement in Recommendation Systems via Temporal Point-Processes
Recommender systems are an important part of the modern human experience whose
influence ranges from the food we eat to the news we read. Yet, there is still debate as to …
Long-term Off-Policy Evaluation and Learning
Short- and long-term outcomes of an algorithm often differ, with damaging downstream
effects. A known example is a click-bait algorithm, which may increase short-term clicks but …
Contextual Bandit with Herding Effects: Algorithms and Recommendation Applications
L Xu, L Wang, H Xie, M Zhou - Pacific Rim International Conference on …, 2024 - Springer
Contextual bandits serve as a fundamental algorithmic framework for optimizing
recommendation decisions online. Though extensive attention has been paid to tailoring …
Maximizing success rate of payment routing using non-stationary bandits
This paper discusses the system architecture design and deployment of non-stationary multi-
armed bandit approaches to determine a near-optimal payment routing policy based on the …
Evaluating and Utilizing Surrogate Outcomes in Covariate-Adjusted Response-Adaptive Designs
This manuscript explores the intersection of surrogate outcomes and adaptive designs in
statistical research. While surrogate outcomes have long been studied for their potential to …
Predicting Long Term Sequential Policy Value Using Softer Surrogates
Performing policy evaluation in education, healthcare and online commerce can be
challenging, because it can require waiting substantial amounts of time to observe outcomes …
Neural Contextual Bandits for Personalized Recommendation
In the dynamic landscape of online businesses, recommender systems are pivotal in
enhancing user experiences. While traditional approaches have relied on static supervised …
Short-Long Policy Evaluation with Novel Actions
HA Nam, Y Chandak, E Brunskill - arXiv preprint arXiv:2407.03674, 2024 - arxiv.org
From incorporating LLMs in education, to identifying new drugs and improving ways to
charge batteries, innovators constantly try new strategies in search of better long-term …
The Fault in Our Recommendations: On the Perils of Optimizing the Measurable
Recommendation systems are widespread, and through customized recommendations,
promise to match users with options they will like. To that end, data on engagement is …