Bias and debias in recommender system: A survey and future directions

J Chen, H Dong, X Wang, F Feng, M Wang… - ACM Transactions on …, 2023 - dl.acm.org
While recent years have witnessed a rapid growth of research papers on recommender
system (RS), most of the papers focus on inventing machine learning models to better fit …

On the opportunities and challenges of offline reinforcement learning for recommender systems

X Chen, S Wang, J McAuley, D Jannach… - ACM Transactions on …, 2024 - dl.acm.org
Reinforcement learning serves as a potent tool for modeling dynamic user interests within
recommender systems, garnering increasing research attention of late. However, a …

Off-policy actor-critic for recommender systems

M Chen, C Xu, V Gatto, D Jain, A Kumar… - Proceedings of the 16th …, 2022 - dl.acm.org
Industrial recommendation platforms are increasingly concerned with how to make
recommendations that cause users to enjoy their long term experience on the platform …

Pessimistic reward models for off-policy learning in recommendation

O Jeunen, B Goethals - Proceedings of the 15th ACM Conference on …, 2021 - dl.acm.org
Methods for bandit learning from user interactions often require a model of the reward a
certain context-action pair will yield – for example, the probability of a click on a …

Pessimistic decision-making for recommender systems

O Jeunen, B Goethals - ACM Transactions on Recommender Systems, 2023 - dl.acm.org
Modern recommender systems are often modelled under the sequential decision-making
paradigm, where the system decides which recommendations to show in order to maximise …

Counteracting user attention bias in music streaming recommendation via reward modification

X Zhang, S Dai, J Xu, Z Dong, Q Dai… - Proceedings of the 28th …, 2022 - dl.acm.org
In streaming media applications, like music Apps, songs are recommended in a continuous
way in users' daily life. The recommended songs are played automatically although users …

BLOB: A probabilistic model for recommendation that combines organic and bandit signals

O Sakhi, S Bonner, D Rohde, F Vasile - Proceedings of the 26th ACM …, 2020 - dl.acm.org
A common task for recommender systems is to build a profile of the interests of a user from
items in their browsing history and later to recommend items to the user from the same …

Top-k contextual bandits with equity of exposure

O Jeunen, B Goethals - Proceedings of the 15th ACM Conference on …, 2021 - dl.acm.org
The contextual bandit paradigm provides a general framework for decision-making under
uncertainty. It is theoretically well-defined and well-studied, and many personalisation use …

Off-Policy Learning-to-Bid with AuctionGym

O Jeunen, S Murphy, B Allison - Proceedings of the 29th ACM SIGKDD …, 2023 - dl.acm.org
Online advertising opportunities are sold through auctions, billions of times every day across
the web. Advertisers who participate in those auctions need to decide on a bidding strategy …

Practical counterfactual policy learning for top-k recommendations

Y Liu, JN Yen, B Yuan, R Shi, P Yan… - Proceedings of the 28th …, 2022 - dl.acm.org
For building recommender systems, a critical task is to learn a policy with collected feedback
(e.g., ratings, clicks) to decide which items to be recommended to users. However, it has been …