Harnessing structures for value-based planning and reinforcement learning

A Mohan, A Zhang, M Lindauer - arXiv preprint arXiv:2306.16021, 2023 - academia.edu

Reinforcement Learning (RL), bolstered by the expressive capabilities of Deep Neural
Networks (DNNs) for function approximation, has demonstrated considerable success in …

Cited by 10 Related articles All 2 versions

[PDF] neurips.cc

Sample efficient reinforcement learning via low-rank matrix estimation

D Shah, D Song, Z Xu, Y Yang - Advances in Neural …, 2020 - proceedings.neurips.cc

We consider the question of learning $Q$-function in a sample efficient manner for
reinforcement learning with continuous state and action spaces under a generative model. If …

Cited by 42 Related articles All 6 versions

[PDF] arxiv.org

Non-asymptotic analysis of Monte Carlo tree search

D Shah, Q Xie, Z Xu - Abstracts of the 2020 SIGMETRICS/Performance …, 2020 - dl.acm.org

In this work, we consider the popular tree-based search strategy within the framework of
reinforcement learning, the Monte Carlo Tree Search (MCTS), in the context of infinite …

Cited by 44 Related articles All 12 versions

[PDF] acm.org

Overcoming the long horizon barrier for sample-efficient reinforcement learning with latent low-rank structure

T Sam, Y Chen, CL Yu - Proceedings of the ACM on Measurement and …, 2023 - dl.acm.org

The practicality of reinforcement learning algorithms has been limited due to poor scaling
with respect to the problem size, as the sample complexity of learning an ε-optimal policy is …

Cited by 13 Related articles All 6 versions

[PDF] arxiv.org

Conditional imitation learning for multi-agent games

A Shih, S Ermon, D Sadigh - 2022 17th ACM/IEEE International …, 2022 - ieeexplore.ieee.org

While advances in multi-agent learning have enabled the training of increasingly complex
agents, most existing techniques produce a final policy that is not designed to adapt to a …

Cited by 11 Related articles All 7 versions

[PDF] openreview.net

Curvature explains loss of plasticity

A Lewandowski, H Tanaka, D Schuurmans… - 2023 - openreview.net

Loss of plasticity is a phenomenon in which neural networks lose their ability to learn from
new experience. Despite being empirically observed in several problem settings, little is …

Cited by 5 Related articles All 3 versions

[PDF] jair.org

Structure in Deep Reinforcement Learning: A Survey and Open Problems

A Mohan, A Zhang, M Lindauer - Journal of Artificial Intelligence Research, 2024 - jair.org

Reinforcement Learning (RL), bolstered by the expressive capabilities of Deep Neural
Networks (DNNs) for function approximation, has demonstrated considerable success in …

Dissecting Deep RL with High Update Ratios: Combatting Value Overestimation and Divergence

M Hussing, C Voelcker, I Gilitschenski… - arXiv preprint arXiv …, 2024 - arxiv.org

We show that deep reinforcement learning can maintain its ability to learn without resetting
network parameters in settings where the number of gradient updates greatly exceeds the …

Cited by 4 Related articles All 2 versions

[PDF] mdpi.com

Automatic generation of meta-path graph for concept recommendation in MOOCs

J Gong, C Wang, Z Zhao, X Zhang - Electronics, 2021 - mdpi.com

In MOOCs, generally speaking, curriculum designing, course selection, and knowledge
concept recommendation are the three major steps that systematically instruct users to learn …

Cited by 8 Related articles All 5 versions

[PDF] arxiv.org

Tensor and matrix low-rank value-function approximation in reinforcement learning

S Rozada, S Paternain… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Value function (VF) approximation is a central problem in reinforcement learning (RL).
Classical non-parametric VF estimation suffers from the curse of dimensionality. As a result …

Cited by 6 Related articles All 4 versions