Reinforcement learning based recommender systems: A survey

MM Afsar, T Crump, B Far - ACM Computing Surveys, 2022 - dl.acm.org
Recommender systems (RSs) have become an inseparable part of our everyday lives. They
help us find our favorite items to purchase, our friends on social networks, and our favorite …

[PDF] A comprehensive survey on safe reinforcement learning

J García, F Fernández - Journal of Machine Learning Research, 2015 - jmlr.org
Safe Reinforcement Learning can be defined as the process of learning policies
that maximize the expectation of the return in problems in which it is important to ensure …

Deep learning in robotics: a review of recent research

HA Pierson, MS Gashler - Advanced Robotics, 2017 - Taylor & Francis
Advances in deep learning over the last decade have led to a flurry of research in the
application of deep artificial neural networks to robotic systems, with at least 30 papers …

A survey of robot learning from demonstration

BD Argall, S Chernova, M Veloso… - Robotics and autonomous …, 2009 - Elsevier
We present a comprehensive survey of robot Learning from Demonstration (LfD), a
technique that develops policies from example state to action mappings. We introduce the …

[BOOK][B] Robot learning from human teachers

S Chernova, AL Thomaz - 2014 - books.google.com
Learning from Demonstration (LfD) explores techniques for learning a task policy from
examples provided by a human teacher. The field of LfD has grown into an extensive body …

Introduction to spiking neural networks: Information processing, learning and applications

F Ponulak, A Kasinski - Acta neurobiologiae experimentalis, 2011 - ane.pl
The concept that neural information is encoded in the firing rate of neurons has been the
dominant paradigm in neurobiology for many years. This paradigm has also been adopted …

[BOOK][B] Intelligent automatic generation control

H Bevrani, T Hiyama - 2011 - api.taylorfrancis.com
Automatic generation control (AGC) is one of the important control problems in
interconnected power system design and operation, and is becoming more significant today …

Subsecond dopamine fluctuations in human striatum encode superposed error signals about actual and counterfactual reward

KT Kishida, I Saez, T Lohrenz… - Proceedings of the …, 2016 - National Acad Sciences
In the mammalian brain, dopamine is a critical neuromodulator whose actions underlie
learning, decision-making, and behavioral control. Degeneration of dopamine neurons …

Reinforcement learning for optimal control of low exergy buildings

L Yang, Z Nagy, P Goffin, A Schlueter - Applied Energy, 2015 - Elsevier
Over a third of the anthropogenic greenhouse gas (GHG) emissions stem from cooling and
heating buildings, due to their fossil fuel based operation. Low exergy building systems are …

Imaging valuation models in human choice

PR Montague, B King-Casas, JD Cohen - Annu. Rev. Neurosci., 2006 - annualreviews.org
To make a decision, a system must assign value to each of its available choices. In the
human brain, one approach to studying valuation has used rewarding stimuli to map out …