[BOOK][B] Bandit algorithms
T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …
Fairness of exposure in rankings
A Singh, T Joachims - Proceedings of the 24th ACM SIGKDD …, 2018 - dl.acm.org
Rankings are ubiquitous in the online world today. As we have transitioned from finding
books in libraries to ranking products, jobs, job applicants, opinions and potential romantic …
Introduction to multi-armed bandits
A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com
Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …
Controlling fairness and bias in dynamic learning-to-rank
Rankings are the primary interface through which many online platforms match users to
items (e.g., news, products, music, video). In these two-sided markets, not only the users draw …
Ranking with fairness constraints
LE Celis, D Straszak, NK Vishnoi - arXiv preprint arXiv:1704.06840, 2017 - arxiv.org
Ranking algorithms are deployed widely to order a set of items in applications such as
search engines, news feeds, and recommendation systems. Recent studies, however, have …
Evaluating stochastic rankings with expected exposure
We introduce the concept of expected exposure as the average attention ranked items
receive from users over repeated samples of the same query. Furthermore, we advocate for …
Explore, exploit, and explain: personalizing explainable recommendations with bandits
J McInerney, B Lacker, S Hansen, K Higley… - Proceedings of the 12th …, 2018 - dl.acm.org
The multi-armed bandit is an important framework for balancing exploration with exploitation
in recommendation. Exploitation recommends content (e.g., products, movies, music playlists) …
Policy learning for fairness in ranking
A Singh, T Joachims - Advances in neural information …, 2019 - proceedings.neurips.cc
Conventional Learning-to-Rank (LTR) methods optimize the utility of the rankings to
the users, but they are oblivious to their impact on the ranked items. However, there has …
Determinantal point processes for machine learning
Determinantal point processes (DPPs) are elegant probabilistic models of repulsion that
arise in quantum physics and random matrix theory. In contrast to traditional structured …
Measuring the business value of recommender systems
Recommender Systems are nowadays successfully used by all major web sites—from e-
commerce to social media—to filter content and make suggestions in a personalized way …