A systematic study on reinforcement learning based applications
K Sivamayil, E Rajasekar, B Aljafari, S Nikolovski… - Energies, 2023 - mdpi.com
We have analyzed 127 publications for this review paper, which discuss applications of
Reinforcement Learning (RL) in marketing, robotics, gaming, automated cars, natural …
Reinforcement learning and bandits for speech and language processing: Tutorial, review and outlook
B Lin - Expert Systems with Applications, 2024 - Elsevier
In recent years, reinforcement learning and bandits have transformed a wide range of real-
world applications including healthcare, finance, recommendation systems, robotics, and …
Pessimistic reward models for off-policy learning in recommendation
O Jeunen, B Goethals - Proceedings of the 15th ACM Conference on …, 2021 - dl.acm.org
Methods for bandit learning from user interactions often require a model of the reward a
certain context-action pair will yield–for example, the probability of a click on a …
Scalable neural contextual bandit for recommender systems
High-quality recommender systems ought to deliver both innovative and relevant content
through effective and exploratory interactions with users. Yet, supervised learning-based …
Pessimistic decision-making for recommender systems
O Jeunen, B Goethals - ACM Transactions on Recommender Systems, 2023 - dl.acm.org
Modern recommender systems are often modelled under the sequential decision-making
paradigm, where the system decides which recommendations to show in order to maximise …
Flexible recommendation for optimizing the debt collection process based on customer risk using deep reinforcement learning
K Sivamayilvelan, E Rajasekar… - Expert Systems with …, 2024 - Elsevier
Finance sector loss can be minimized by reducing the number of defaulters who often miss
payments during debt collection. Most research focused on the credit risk analysis before …
Deep meta-learning in recommendation systems: A survey
Deep neural network based recommendation systems have achieved great success as
information filtering techniques in recent years. However, since model training from scratch …
Efficient online bayesian inference for neural bandits
G Duran-Martin, A Kara… - … Conference on Artificial …, 2022 - proceedings.mlr.press
In this paper we present a new algorithm for online (sequential) inference in Bayesian
neural networks, and show its suitability for tackling contextual bandit problems. The key …
Conservative exploration in reinforcement learning
E Garcelon, M Ghavamzadeh… - International …, 2020 - proceedings.mlr.press
While learning in an unknown Markov Decision Process (MDP), an agent should trade off
exploration to discover new information about the MDP, and exploitation of the current …
Adversarial gradient driven exploration for deep click-through rate prediction
Exploration-Exploitation (E&E) algorithms are commonly adopted to deal with the feedback-
loop issue in large-scale online recommender systems. Most of existing studies believe that …