Harnessing density ratios for online reinforcement learning

P Amortila, DJ Foster, N Jiang, A Sekhari… - arXiv preprint arXiv …, 2024 - arxiv.org
The theories of offline and online reinforcement learning, despite having evolved in parallel,
have begun to show signs of possible unification, with algorithms and analysis …

Exploration is harder than prediction: Cryptographically separating reinforcement learning from supervised learning

N Golowich, A Moitra, D Rohatgi - arXiv preprint arXiv:2404.03774, 2024 - arxiv.org
Supervised learning is often computationally easy in practice. But to what extent does this
mean that other modes of learning, such as reinforcement learning (RL), ought to be …

Scalable Online Exploration via Coverability

P Amortila, DJ Foster, A Krishnamurthy - arXiv preprint arXiv:2403.06571, 2024 - arxiv.org
Exploration is a major challenge in reinforcement learning, especially for high-dimensional
domains that require function approximation. We propose exploration objectives--policy …

Efficiently Learning Markov Random Fields from Dynamics

J Gaitonde, A Moitra, E Mossel - arXiv preprint arXiv:2409.05284, 2024 - arxiv.org
An important task in high-dimensional statistics is learning the parameters or dependency
structure of an undirected graphical model, or Markov random field (MRF). Much of the prior …

On Learning Parities with Dependent Noise

N Golowich, A Moitra, D Rohatgi - arXiv preprint arXiv:2404.11325, 2024 - arxiv.org
In this expository note we show that the learning parities with noise (LPN) assumption is
robust to weak dependencies in the noise distribution of small batches of samples. This …