Runzhe Wan
Verified email at amazon.com
Title
Cited by
Year
Batch policy learning in average reward markov decision processes
P Liao, Z Qi, R Wan, P Klasnja, S Murphy
arXiv preprint arXiv:2007.11771, 2022
Cited by: 76; Year: 2022
Deeply-debiased off-policy interval estimation
C Shi, R Wan, V Chernozhukov, R Song
International Conference on Machine Learning, 9580-9591, 2021
Cited by: 36; Year: 2021
Does the Markov decision process fit the data: testing for the Markov property in sequential decision making
C Shi, R Wan, R Song, W Lu, L Leng
International Conference on Machine Learning, 8807-8817, 2020
Cited by: 34; Year: 2020
Multi-Objective Model-based Reinforcement Learning for Infectious Disease Control
R Wan, X Zhang, R Song
Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data …, 2021
Cited by: 31; Year: 2021
Metadata-based multi-task bandits with bayesian hierarchical models
R Wan, L Ge, R Song
Advances in Neural Information Processing Systems 34, 29655-29668, 2021
Cited by: 25; Year: 2021
Safe Exploration for Efficient Policy Evaluation and Comparison
R Wan, B Kveton, R Song
International Conference on Machine Learning, 22491-22511, 2022
Cited by: 12; Year: 2022
Towards scalable and robust structured bandits: A meta-learning framework
R Wan, L Ge, R Song
International Conference on Artificial Intelligence and Statistics, 1144-1173, 2023
Cited by: 10; Year: 2023
A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets
C Shi, R Wan, G Song, S Luo, R Song, H Zhu
arXiv preprint arXiv:2202.10574, 2022
Cited by: 10; Year: 2022
A multiagent reinforcement learning framework for off-policy evaluation in two-sided markets
C Shi, R Wan, G Song, S Luo, H Zhu, R Song
The Annals of Applied Statistics 17 (4), 2701-2722, 2023
Cited by: 3; Year: 2023
STEEL: Singularity-aware Reinforcement Learning
X Chen, Z Qi, R Wan
arXiv preprint arXiv:2301.13152, 2023
Cited by: 3; Year: 2023
Pattern Transfer Learning for Reinforcement Learning in Order Dispatching
R Wan, S Zhang, C Shi, S Luo, R Song
arXiv preprint arXiv:2105.13218, 2021
Cited by: 3; Year: 2021
Robust Offline Policy Evaluation and Optimization with Heavy-Tailed Rewards
J Zhu, R Wan, Z Qi, S Luo, C Shi
arXiv preprint arXiv:2310.18715, 2023
Cited by: 2; Year: 2023
Multiplier bootstrap-based exploration
R Wan, H Wei, B Kveton, R Song
International Conference on Machine Learning, 35444-35490, 2023
Cited by: 2; Year: 2023
Experimentation platforms meet reinforcement learning: Bayesian sequential decision-making for continuous monitoring
R Wan, Y Liu, J McQueen, D Hains, R Song
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and …, 2023
Cited by: 1; Year: 2023
Mining the factor zoo: Estimation of latent factor models with sufficient proxies
R Wan, Y Li, W Lu, R Song
Journal of Econometrics, 105386, 2023
Cited by: 1; Year: 2023
Advances in Statistical Inference and Policy Optimization for Reinforcement Learning
R Wan
North Carolina State University, 2022
Cited by: 1; Year: 2022
Online testing efficiency through early termination
Y Liu, R Wan, J McQueen, D Hains, R Song, RH Castillo
US Patent 11,909,829, 2024
Year: 2024
Zero-Inflated Bandits
H Wei, R Wan, L Shi, R Song
arXiv preprint arXiv:2312.15595, 2023
Year: 2023
Effect Size Estimation for Duration Recommendation in Online Experiments: Leveraging Hierarchical Models and Objective Utility Approaches
Y Liu, R Wan, J McQueen, D Hains, J Gu, R Song
arXiv preprint arXiv:2312.12871, 2023
Year: 2023
Singularity-aware Reinforcement Learning
X Chen, Z Qi, R Wan
arXiv preprint arXiv:2301.13152, 2023
Year: 2023