Content-driven music recommendation: Evolution, state of the art, and challenges

Y Deldjoo, M Schedl, P Knees - Computer Science Review, 2024 - Elsevier
The music domain is among the most important ones for adopting recommender systems
technology. In contrast to most other recommendation domains, which predominantly rely on …

Is RLHF more difficult than standard RL? A theoretical perspective

Y Wang, Q Liu, C Jin - Advances in Neural Information …, 2023 - proceedings.neurips.cc
Reinforcement Learning from Human Feedback (RLHF) learns from preference
signals, while standard Reinforcement Learning (RL) directly learns from reward signals …

Positive, negative and neutral: Modeling implicit feedback in session-based news recommendation

S Gong, KQ Zhu - Proceedings of the 45th international ACM SIGIR …, 2022 - dl.acm.org
News recommendation for anonymous readers is a useful but challenging task for many
news portals, where interactions between readers and articles are limited within a temporary …

Preference-based online learning with dueling bandits: A survey

V Bengs, R Busa-Fekete, A El Mesaoudi-Paul… - Journal of Machine …, 2021 - jmlr.org
In machine learning, the notion of multi-armed bandits refers to a class of online learning
problems, in which an agent is supposed to simultaneously explore and exploit a given set …

Arithmetic control of LLMs for diverse user preferences: Directional preference alignment with multi-objective rewards

H Wang, Y Lin, W Xiong, R Yang, S Diao, S Qiu… - arXiv preprint arXiv …, 2024 - arxiv.org
Fine-grained control over large language models (LLMs) remains a significant challenge,
hindering their adaptability to diverse user needs. While Reinforcement Learning from …

Is RLHF More Difficult than Standard RL?

Y Wang, Q Liu, C Jin - arXiv preprint arXiv:2306.14111, 2023 - arxiv.org
Reinforcement Learning from Human Feedback (RLHF) learns from preference signals,
while standard Reinforcement Learning (RL) directly learns from reward signals …

Carousel personalization in music streaming apps with contextual bandits

W Bendada, G Salha, T Bontempelli - … of the 14th ACM Conference on …, 2020 - dl.acm.org
Media services providers, such as music streaming platforms, frequently leverage swipeable
carousels to recommend personalized content to their users. However, selecting the most …

Counteracting user attention bias in music streaming recommendation via reward modification

X Zhang, S Dai, J Xu, Z Dong, Q Dai… - Proceedings of the 28th …, 2022 - dl.acm.org
In streaming media applications, like music apps, songs are recommended in a continuous
way in users' daily life. The recommended songs are played automatically although users …

DisCover: Disentangled music representation learning for cover song identification

J Xun, S Zhang, Y Yang, J Zhu, L Deng… - Proceedings of the 46th …, 2023 - dl.acm.org
In the field of music information retrieval (MIR), cover song identification (CSI) is a
challenging task that aims to identify cover versions of a query song from a massive …

Building cross-sectional systematic strategies by learning to rank

D Poh, B Lim, S Zohren, S Roberts - arXiv preprint arXiv:2012.07149, 2020 - arxiv.org
The success of a cross-sectional systematic strategy depends critically on accurately ranking
assets prior to portfolio construction. Contemporary techniques perform this ranking step …