Bias and debias in recommender system: A survey and future directions
While recent years have witnessed a rapid growth of research papers on recommender
system (RS), most of the papers focus on inventing machine learning models to better fit …
On the opportunities and challenges of offline reinforcement learning for recommender systems
Reinforcement learning serves as a potent tool for modeling dynamic user interests within
recommender systems, garnering increasing research attention of late. However, a …
Off-policy actor-critic for recommender systems
Industrial recommendation platforms are increasingly concerned with how to make
recommendations that cause users to enjoy their long term experience on the platform …
Pessimistic reward models for off-policy learning in recommendation
O Jeunen, B Goethals - Proceedings of the 15th ACM Conference on …, 2021 - dl.acm.org
Methods for bandit learning from user interactions often require a model of the reward a
certain context-action pair will yield–for example, the probability of a click on a …
Pessimistic decision-making for recommender systems
O Jeunen, B Goethals - ACM Transactions on Recommender Systems, 2023 - dl.acm.org
Modern recommender systems are often modelled under the sequential decision-making
paradigm, where the system decides which recommendations to show in order to maximise …
Counteracting user attention bias in music streaming recommendation via reward modification
In streaming media applications, like music Apps, songs are recommended in a continuous
way in users' daily life. The recommended songs are played automatically although users …
BLOB: A probabilistic model for recommendation that combines organic and bandit signals
A common task for recommender systems is to build a profile of the interests of a user from
items in their browsing history and later to recommend items to the user from the same …
Top-k contextual bandits with equity of exposure
O Jeunen, B Goethals - Proceedings of the 15th ACM Conference on …, 2021 - dl.acm.org
The contextual bandit paradigm provides a general framework for decision-making under
uncertainty. It is theoretically well-defined and well-studied, and many personalisation use …
Off-Policy Learning-to-Bid with AuctionGym
O Jeunen, S Murphy, B Allison - Proceedings of the 29th ACM SIGKDD …, 2023 - dl.acm.org
Online advertising opportunities are sold through auctions, billions of times every day across
the web. Advertisers who participate in those auctions need to decide on a bidding strategy …
Practical counterfactual policy learning for top-k recommendations
For building recommender systems, a critical task is to learn a policy with collected feedback
(e.g., ratings, clicks) to decide which items to be recommended to users. However, it has been …