Accelerating exploration with unlabeled prior data
Learning to solve tasks from a sparse reward signal is a major challenge for standard
reinforcement learning (RL) algorithms. However, in the real world, agents rarely need to …
reinforcement learning (RL) algorithms. However, in the real world, agents rarely need to …
Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity
We study the problem of online sequential decision-making given auxiliary demonstrations
from experts who made their decisions based on unobserved contextual information. These …
from experts who made their decisions based on unobserved contextual information. These …