Batch policy learning in average reward Markov decision processes P Liao, Z Qi, R Wan, P Klasnja, S Murphy arXiv preprint arXiv:2007.11771, 2022 | 76 | 2022 |
Deeply-debiased off-policy interval estimation C Shi, R Wan, V Chernozhukov, R Song International Conference on Machine Learning, 9580-9591, 2021 | 36 | 2021 |
Does the Markov decision process fit the data: testing for the Markov property in sequential decision making C Shi, R Wan, R Song, W Lu, L Leng International Conference on Machine Learning, 8807-8817, 2020 | 34 | 2020 |
Multi-Objective Model-based Reinforcement Learning for Infectious Disease Control R Wan, X Zhang, R Song Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data …, 2021 | 31 | 2021 |
Metadata-based multi-task bandits with Bayesian hierarchical models R Wan, L Ge, R Song Advances in Neural Information Processing Systems 34, 29655-29668, 2021 | 25 | 2021 |
Safe Exploration for Efficient Policy Evaluation and Comparison R Wan, B Kveton, R Song International Conference on Machine Learning, 22491-22511, 2022 | 12 | 2022 |
Towards scalable and robust structured bandits: A meta-learning framework R Wan, L Ge, R Song International Conference on Artificial Intelligence and Statistics, 1144-1173, 2023 | 10 | 2023 |
A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets C Shi, R Wan, G Song, S Luo, R Song, H Zhu arXiv preprint arXiv:2202.10574, 2022 | 10 | 2022 |
A multiagent reinforcement learning framework for off-policy evaluation in two-sided markets C Shi, R Wan, G Song, S Luo, H Zhu, R Song The Annals of Applied Statistics 17 (4), 2701-2722, 2023 | 3 | 2023 |
STEEL: Singularity-aware Reinforcement Learning X Chen, Z Qi, R Wan arXiv preprint arXiv:2301.13152, 2023 | 3 | 2023 |
Pattern Transfer Learning for Reinforcement Learning in Order Dispatching R Wan, S Zhang, C Shi, S Luo, R Song arXiv preprint arXiv:2105.13218, 2021 | 3 | 2021 |
Robust Offline Policy Evaluation and Optimization with Heavy-Tailed Rewards J Zhu, R Wan, Z Qi, S Luo, C Shi arXiv preprint arXiv:2310.18715, 2023 | 2 | 2023 |
Multiplier bootstrap-based exploration R Wan, H Wei, B Kveton, R Song International Conference on Machine Learning, 35444-35490, 2023 | 2 | 2023 |
Experimentation platforms meet reinforcement learning: Bayesian sequential decision-making for continuous monitoring R Wan, Y Liu, J McQueen, D Hains, R Song Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and …, 2023 | 1 | 2023 |
Mining the factor zoo: Estimation of latent factor models with sufficient proxies R Wan, Y Li, W Lu, R Song Journal of Econometrics, 105386, 2023 | 1 | 2023 |
Advances in Statistical Inference and Policy Optimization for Reinforcement Learning R Wan North Carolina State University, 2022 | 1 | 2022 |
Online testing efficiency through early termination Y Liu, R Wan, J McQueen, D Hains, R Song, RH Castillo US Patent 11,909,829, 2024 | | 2024 |
Zero-Inflated Bandits H Wei, R Wan, L Shi, R Song arXiv preprint arXiv:2312.15595, 2023 | | 2023 |
Effect Size Estimation for Duration Recommendation in Online Experiments: Leveraging Hierarchical Models and Objective Utility Approaches Y Liu, R Wan, J McQueen, D Hains, J Gu, R Song arXiv preprint arXiv:2312.12871, 2023 | | 2023 |