The causal structure and computational value of narratives
J Chen, AM Bornstein - Trends in Cognitive Sciences, 2024 - cell.com
Many human behavioral and brain imaging studies have used narratively structured stimuli
(e.g., written, audio, or audiovisual stories) to better emulate real-world experience in the …
Dual credit assignment processes underlie dopamine signals in a complex spatial environment
Animals frequently make decisions based on expectations of future reward ("values").
Values are updated by ongoing experience: places and choices that result in reward are …
Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis
A Meulemans, S Schug… - Advances in Neural …, 2024 - proceedings.neurips.cc
To make reinforcement learning more sample efficient, we need better credit assignment
methods that measure an action's influence on future rewards. Building upon Hindsight …
Maximum state entropy exploration using predecessor and successor representations
Animals have a developed ability to explore that aids them in important tasks such as
locating food, exploring for shelter, and finding misplaced items. These exploration skills …
A survey of temporal credit assignment in deep reinforcement learning
The Credit Assignment Problem (CAP) refers to the longstanding challenge of
Reinforcement Learning (RL) agents to associate actions with their long-term …
The pitfalls of regularization in off-policy TD learning
G Manek, JZ Kolter - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Temporal Difference (TD) learning is ubiquitous in reinforcement learning, where it is often
combined with off-policy sampling and function approximation. Unfortunately learning with …
Forethought and hindsight in credit assignment
We address the problem of credit assignment in reinforcement learning and explore
fundamental questions regarding the way in which an agent can best use additional …
Optimal control of nonlinear system based on deterministic policy gradient with eligibility traces
Optimal control of nonlinear systems by using adaptive dynamic programming (ADP)
methods is always a hot topic in recent years. However, unknown nonlinear systems with …
An information-theoretic perspective on credit assignment in reinforcement learning
How do we formalize the challenge of credit assignment in reinforcement learning?
Common intuition would draw attention to reward sparsity as a key contributor to difficult …
Learning expected emphatic traces for deep RL
Off-policy sampling and experience replay are key for improving sample efficiency and
scaling model-free temporal difference learning methods. When combined with function …