Sample-efficient actor-critic reinforcement learning with supervised data for dialogue management
Deep reinforcement learning (RL) methods have significant potential for dialogue policy
optimisation. However, they suffer from a poor performance in the early stages of learning …
Sample efficient deep reinforcement learning for dialogue systems with large action spaces
In spoken dialogue systems, we aim to deploy artificial intelligence to build automated
dialogue agents that can converse with humans. A part of this effort is the policy optimization …
Flow: Deep reinforcement learning for control in SUMO
We detail the motivation and design decisions underpinning Flow, a computational
framework integrating SUMO with the deep reinforcement learning libraries rllab and RLlib …
Generalized off-policy actor-critic
We propose a new objective, the counterfactual objective, unifying existing objectives for off-
policy policy gradient algorithms in the continuing reinforcement learning (RL) setting …
Dynamic planning in open-ended dialogue using reinforcement learning
Despite recent advances in natural language understanding and generation, and decades
of research on the development of conversational bots, building automated agents that can …
A mixture-of-expert approach to RL-based dialogue management
Despite recent advancements in language models (LMs), their application to dialogue
management (DM) problems and ability to carry on rich conversations remain a challenge …
Indoor scene change captioning based on multimodality data
This study proposes a framework for describing a scene change using natural language text
based on indoor scene observations conducted before and after a scene change. The …
Reduced robust random cut forest for out-of-distribution detection in machine learning models
H Vardhan, J Sztipanovits - arXiv preprint arXiv:2206.09247, 2022 - arxiv.org
Most machine learning-based regressors extract information from data collected via past
observations of limited length to make predictions in the future. Consequently, when input to …
Mean actor critic
We propose a new algorithm, Mean Actor-Critic (MAC), for discrete-action continuous-state
reinforcement learning. MAC is a policy gradient algorithm that uses the agent's explicit …
Towards solving text-based games by producing adaptive action spaces
To solve a text-based game, an agent needs to formulate valid text commands for a given
context and find the ones that lead to success. Recent attempts at solving text-based games …