Q-learning with UCB exploration is sample efficient for infinite-horizon MDP K Dong, Y Wang, X Chen, L Wang International Conference on Learning Representations, 2019 | 112 | 2019 |
Exploration via hindsight goal generation Z Ren, K Dong, Y Zhou, Q Liu, J Peng Advances in Neural Information Processing Systems 32, 2019 | 85 | 2019 |
Root-n-regret for learning in Markov decision processes with function approximation and low Bellman rank K Dong, J Peng, Y Wang, Y Zhou Conference on Learning Theory, 1554-1557, 2020 | 46 | 2020 |
Provable model-based nonlinear bandit and reinforcement learning: Shelve optimism, embrace virtual curvature K Dong, J Yang, T Ma Advances in Neural Information Processing Systems 34, 26168-26182, 2021 | 40 | 2021 |
On the expressivity of neural networks for deep reinforcement learning K Dong, Y Luo, T Yu, C Finn, T Ma International Conference on Machine Learning, 2627-2637, 2020 | 34 | 2020 |
Design of experiments for stochastic contextual linear bandits A Zanette, K Dong, JN Lee, E Brunskill Advances in Neural Information Processing Systems 34, 22720-22731, 2021 | 23 | 2021 |
Multinomial logit bandit with low switching cost K Dong, Y Li, Q Zhang, Y Zhou International Conference on Machine Learning, 2607-2615, 2020 | 19 | 2020 |
First steps toward understanding the extrapolation of nonlinear models to unseen domains K Dong, T Ma arXiv preprint arXiv:2211.11719, 2022 | 11 | 2022 |
Asymptotic instance-optimal algorithms for interactive decision making K Dong, T Ma arXiv preprint arXiv:2206.02326, 2022 | 9 | 2022 |
Beyond NTK with vanilla gradient descent: A mean-field analysis of neural networks with polynomial width, samples, and time A Mahankali, H Zhang, K Dong, M Glasgow, T Ma Advances in Neural Information Processing Systems 36, 2024 | 8 | 2024 |
Toward L_∞ Recovery of Nonlinear Functions: A Polynomial Sample Complexity Bound for Gaussian Random Fields K Dong, T Ma The Thirty Sixth Annual Conference on Learning Theory, 2877-2918, 2023 | 2 | 2023 |
Refined analysis of FPL for adversarial Markov decision processes Y Wang, K Dong arXiv preprint arXiv:2008.09251, 2020 | 2 | 2020 |
Model-based offline reinforcement learning with local misspecification K Dong, Y Flet-Berliac, A Nie, E Brunskill Proceedings of the AAAI Conference on Artificial Intelligence 37 (6), 7423-7431, 2023 | 1 | 2023 |