| Title | Authors | Venue | Cited by | Year |
| --- | --- | --- | --- | --- |
| Q-learning with UCB exploration is sample efficient for infinite-horizon MDP | K Dong, Y Wang, X Chen, L Wang | International Conference on Learning Representations, 2020 | 112 | 2020 |
| Distributed bandit learning: Near-optimal regret with efficient communication | Y Wang, J Hu, X Chen, L Wang | International Conference on Learning Representations, 2020 | 92 | 2020 |
| Understanding Domain Randomization for Sim-to-real Transfer | X Chen, J Hu, C Jin, L Li, L Wang | International Conference on Learning Representations, 2022 | 64 | 2022 |
| Near-Optimal Representation Learning for Linear Bandits and Linear RL | J Hu, X Chen, C Jin, L Li, L Wang | International Conference on Machine Learning, 4349-4358, 2021 | 46 | 2021 |
| Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation | X Chen, H Zhong, Z Yang, Z Wang, L Wang | International Conference on Machine Learning, 3773-3793, 2022 | 45 | 2022 |
| Efficient Reinforcement Learning in Factored MDPs with Application to Constrained RL | X Chen, J Hu, L Li, L Wang | International Conference on Learning Representations, 2021 | 21 | 2021 |
| Near-Optimal Reward-Free Exploration for Linear Mixture MDPs with Plug-in Solver | X Chen, J Hu, LF Yang, L Wang | International Conference on Learning Representations, 2022 | 17 | 2022 |
| (Locally) Differentially Private Combinatorial Semi-Bandits | X Chen, K Zheng, Z Zhou, Y Yang, W Chen, L Wang | International Conference on Machine Learning, 1757-1767, 2020 | 3 | 2020 |
| On the power of pre-training for generalization in RL: Provable benefits and hardness | H Ye, X Chen, L Wang, SS Du | International Conference on Machine Learning, 39770-39800, 2023 | 2 | 2023 |