Deep generative models for offline policy learning: Tutorial, survey, and perspectives on future directions

J Chen, B Ganguly, Y Xu, Y Mei, T Lan… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep generative models (DGMs) have demonstrated great success across various domains,
particularly in generating texts, images, and videos using models trained from offline data …

Hierarchical adversarial inverse reinforcement learning

J Chen, T Lan, V Aggarwal - IEEE Transactions on Neural …, 2023 - ieeexplore.ieee.org
Imitation learning (IL) has been proposed to recover the expert policy from demonstrations.
However, it would be difficult to learn a single monolithic policy for highly complex long …

Two-tiered online optimization of region-wide datacenter resource allocation via deep reinforcement learning

CL Chen, H Zhou, J Chen, M Pedramfar… - arXiv e …, 2023 - ui.adsabs.harvard.edu
This paper addresses the important need for advanced techniques in continuously
allocating workloads on shared infrastructures in data centers, a problem arising due to the …

Cloud-Based Hierarchical Imitation Learning for Scalable Transfer of Construction Skills from Human Workers to Assisting Robots

H Yu, VR Kamat, CC Menassa - Journal of Computing in Civil …, 2024 - ascelibrary.org
Assigning repetitive and physically demanding construction tasks to robots can alleviate
human workers' exposure to occupational injuries, which often result in significant downtime …

Learning-Based Two-Tiered Online Optimization of Region-Wide Datacenter Resource Allocation

CL Chen, H Zhou, J Chen, M Pedramfar… - … on Network and …, 2024 - ieeexplore.ieee.org
Online optimization of resource management for large-scale data centers and infrastructures
to meet dynamic capacity reservation demands and various practical constraints (e.g. …

A unified algorithm framework for unsupervised discovery of skills based on determinantal point process

J Chen, V Aggarwal, T Lan - Advances in Neural …, 2024 - proceedings.neurips.cc
Learning rich skills under the option framework without supervision of external rewards is at
the frontier of reinforcement learning research. Existing works mainly fall into two distinctive …

Adaptive generative adversarial maximum entropy inverse reinforcement learning

L Song, D Li, X Xu - Information Sciences, 2024 - Elsevier
Maximum entropy inverse reinforcement learning algorithms have been extensively studied
for learning rewards and optimizing policies using expert demonstrations. However, high …

Identifying Selections for Unsupervised Subtask Discovery

Y Qiu, Y Zheng, K Zhang - arXiv preprint arXiv:2410.21616, 2024 - arxiv.org
When solving long-horizon tasks, it is intriguing to decompose the high-level task into
subtasks. Decomposing experiences into reusable subtasks can improve data efficiency …

Computational Teaching for Driving via Multi-Task Imitation Learning

D Gopinath, X Cui, J DeCastro, E Sumner… - arXiv preprint arXiv …, 2024 - arxiv.org
Learning motor skills for sports or performance driving is often done with professional
instruction from expert human teachers, whose availability is limited. Our goal is to enable …

On the benefits of pixel-based hierarchical policies for task generalization

T Cristea-Platon, B Mazoure, J Susskind… - arXiv preprint arXiv …, 2024 - arxiv.org
Reinforcement learning practitioners often avoid hierarchical policies, especially in
image-based observation spaces. Typically, the single-task performance improvement over flat …