Large-scale validation of counterfactual learning methods: A test-bed

KuaiRec: A fully-observed dataset and insights for evaluating recommender systems

C Gao, S Li, W Lei, J Chen, B Li, P Jiang, X He… - Proceedings of the 31st …, 2022 - dl.acm.org

The progress of recommender systems is hampered mainly by evaluation as it requires
real-time interactions between humans and systems, which is too laborious and expensive. This …

Cited by 115 · Related articles · All 8 versions

Causal embeddings for recommendation

S Bonner, F Vasile - Proceedings of the 12th ACM conference on …, 2018 - dl.acm.org

Many current applications use recommendations in order to modify the natural user
behavior, such as to increase the number of sales or the time spent on a website. This …

Cited by 300 · Related articles · All 11 versions

RecoGym: A reinforcement learning environment for the problem of product recommendation in online advertising

D Rohde, S Bonner, T Dunlop, F Vasile… - arXiv preprint arXiv …, 2018 - arxiv.org

Recommender Systems are becoming ubiquitous in many settings and take many forms,
from product recommendation in e-commerce stores, to query suggestions in search …

Cited by 178 · Related articles · All 4 versions

Generalization bounds and representation learning for estimation of potential outcomes and causal effects

FD Johansson, U Shalit, N Kallus, D Sontag - Journal of Machine Learning …, 2022 - jmlr.org

Practitioners in diverse fields such as healthcare, economics and education are eager to
apply machine learning to improve decision making. The cost and impracticality of …

Cited by 122 · Related articles · All 4 versions

Open bandit dataset and pipeline: Towards realistic and reproducible off-policy evaluation

Y Saito, S Aihara, M Matsutani, Y Narita - arXiv preprint arXiv:2008.07146, 2020 - arxiv.org

Off-policy evaluation (OPE) aims to estimate the performance of hypothetical policies using
data generated by a different policy. Because of its huge potential impact in practice, there …

Cited by 83 · Related articles · All 6 versions

Pessimistic reward models for off-policy learning in recommendation

O Jeunen, B Goethals - Proceedings of the 15th ACM Conference on …, 2021 - dl.acm.org

Methods for bandit learning from user interactions often require a model of the reward a
certain context-action pair will yield, for example, the probability of a click on a …

Cited by 51 · Related articles · All 4 versions

Improving ad click prediction by considering non-displayed events

B Yuan, JY Hsia, MY Yang, H Zhu, CY Chang… - Proceedings of the 28th …, 2019 - dl.acm.org

Click-through rate (CTR) prediction is the core problem of building advertising systems. Most
existing state-of-the-art approaches model CTR prediction as binary classification problems …

Cited by 93 · Related articles · All 6 versions

Unbiased learning for the causal effect of recommendation

M Sato, S Takemori, J Singh, T Ohkuma - … of the 14th ACM conference on …, 2020 - dl.acm.org

Increasing users' positive interactions, such as purchases or clicks, is an important objective
of recommender systems. Recommenders typically aim to select items that users will interact …

Cited by 72 · Related articles · All 4 versions

The music streaming sessions dataset

B Brost, R Mehrotra, T Jehan - The World Wide Web Conference, 2019 - dl.acm.org

At the core of many important machine learning problems faced by online streaming
services is a need to model how users interact with the content they are served …

Cited by 105 · Related articles · All 6 versions

On the factory floor: ML engineering for industrial-scale ads recommendation models

R Anil, S Gadanho, D Huang, N Jacob, Z Li… - arXiv preprint arXiv …, 2022 - arxiv.org

For industrial-scale advertising systems, prediction of ad click-through rate (CTR) is a central
problem. Ad clicks constitute a significant class of user engagements and are often used as …

Cited by 29 · Related articles · All 3 versions