A reinforcement learning-informed pattern mining framework for multivariate time series classification

G Gao, Q Gao, X Yang, M Pajic, M Chi - 31st International Joint …, 2022 - par.nsf.gov
Multivariate time series (MTS) classification is a challenging and important task in various
domains and real-world applications. Much of prior work on MTS can be roughly divided into …

Off-policy evaluation for human feedback

Q Gao, G Gao, J Dong, V Tarokh… - Advances in Neural …, 2024 - proceedings.neurips.cc
Off-policy evaluation (OPE) is important for closing the gap between offline training and
evaluation of reinforcement learning (RL), by estimating performance and/or rank of target …

HOPE: Human-centric off-policy evaluation for e-learning and healthcare

G Gao, S Ju, MS Ausin, M Chi - arXiv preprint arXiv:2302.09212, 2023 - arxiv.org
Reinforcement learning (RL) has been extensively researched for enhancing human-
environment interactions in various human-centric tasks, including e-learning and …

Offline learning of closed-loop deep brain stimulation controllers for Parkinson disease treatment

Q Gao, SL Schmidt, A Chowdhury, G Feng… - Proceedings of the …, 2023 - dl.acm.org
Deep brain stimulation (DBS) has shown great promise toward treating motor symptoms
caused by Parkinson's disease (PD), by delivering electrical pulses to the Basal Ganglia …

Robust reinforcement learning through efficient adversarial herding

J Dong, HL Hsu, Q Gao, V Tarokh, M Pajic - arXiv preprint arXiv …, 2023 - arxiv.org
Although reinforcement learning (RL) is considered the gold standard for policy design, it
may not always provide a robust solution in various scenarios. This can result in severe …

Reconstructing missing EHRs using time-aware within- and cross-visit information for septic shock early prediction

G Gao, F Khoshnevisan, M Chi - 2022 IEEE 10th International …, 2022 - ieeexplore.ieee.org
Real-world Electronic Health Records (EHRs) are often plagued by a high rate of missing
data. In our EHRs, for example, the missing rates can be as high as 90% for some features …

Variational Latent Branching Model for Off-Policy Evaluation

Q Gao, G Gao, M Chi, M Pajic - arXiv preprint arXiv:2301.12056, 2023 - arxiv.org
Model-based methods have recently shown great potential for off-policy evaluation (OPE);
offline trajectories induced by behavioral policies are fitted to transitions of Markov decision …

Robust exploration with adversary via Langevin Monte Carlo

HL Hsu, M Pajic - 6th Annual Learning for Dynamics & …, 2024 - proceedings.mlr.press
In the realm of Deep Q-Networks (DQNs), numerous exploration strategies have
demonstrated efficacy within controlled environments. However, these methods encounter …

Learning for Control and Decision Making toward Medical Autonomy

Q Gao - 2024 - dukespace.lib.duke.edu
Artificial intelligence (AI) and deep learning (DL) have recently shown success in domains
related to healthcare and its decision-making systems. However, most of the existing …

Robust Reinforcement Learning with Structured Adversarial Ensemble

J Dong, HL Hsu, Q Gao, V Tarokh, M Pajic - openreview.net
Although reinforcement learning (RL) is considered the gold standard for policy design, it
may not always provide a robust solution in various scenarios. This can result in severe …