Sample-optimal parametric Q-learning using linearly additive features L Yang, M Wang International Conference on Machine Learning, 6995-7004, 2019 | 348 | 2019 |
Model-based reinforcement learning with value-targeted regression A Ayoub, Z Jia, C Szepesvari, M Wang, L Yang International Conference on Machine Learning, 463-474, 2020 | 312 | 2020 |
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound LF Yang, M Wang International Conference on Machine Learning, 2020, 2019 | 308 | 2019 |
Stochastic compositional gradient descent: algorithms for minimizing compositions of expected-value functions M Wang, EX Fang, H Liu Mathematical Programming 161, 419-449, 2017 | 265 | 2017 |
Near-optimal time and sample complexities for solving Markov decision processes with a generative model A Sidford, M Wang, X Wu, L Yang, Y Ye Advances in Neural Information Processing Systems 31, 2018 | 256* | 2018 |
Approximation methods for bilevel programming S Ghadimi, M Wang arXiv preprint arXiv:1802.02246, 2018 | 218 | 2018 |
Minimax-optimal off-policy evaluation with linear function approximation Y Duan, Z Jia, M Wang International Conference on Machine Learning, 2701-2709, 2020 | 160 | 2020 |
Accelerating stochastic composition optimization M Wang, J Liu, EX Fang Journal of Machine Learning Research, 2017, 2016 | 152 | 2016 |
Variance reduced value iteration and faster algorithms for solving Markov decision processes A Sidford, M Wang, X Wu, Y Ye Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete …, 2017 | 140* | 2017 |
Variational policy gradient method for reinforcement learning with general utilities J Zhang, A Koppel, AS Bedi, C Szepesvari, M Wang Advances in Neural Information Processing Systems 2020, 2020 | 132 | 2020 |
A single timescale stochastic approximation method for nested stochastic optimization S Ghadimi, A Ruszczynski, M Wang SIAM Journal on Optimization 30 (1), 960-979, 2020 | 122 | 2020 |
Stochastic first-order methods with random constraint projection M Wang, DP Bertsekas SIAM Journal on Optimization 26 (1), 681-717, 2016 | 118* | 2016 |
On function approximation in reinforcement learning: Optimism in the face of large state spaces Z Yang, C Jin, Z Wang, M Wang, MI Jordan arXiv preprint arXiv:2011.04622, 2020 | 97* | 2020 |
Finite-sum composition optimization via variance reduced gradient descent X Lian, M Wang, J Liu Artificial Intelligence and Statistics, 2017, 2016 | 94 | 2016 |
Towards compact CNNs via collaborative compression Y Li, S Lin, J Liu, Q Ye, M Wang, F Chao, F Yang, J Ma, Q Tian, R Ji Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021 | 90 | 2021 |
Visual adversarial examples jailbreak aligned large language models X Qi, K Huang, A Panda, P Henderson, M Wang, P Mittal Proceedings of the AAAI Conference on Artificial Intelligence 38 (19), 21527 …, 2024 | 85* | 2024 |
Randomized linear programming solves the Markov decision problem in nearly linear (sometimes sublinear) time M Wang Mathematics of Operations Research 45 (2), 517-546, 2020 | 80* | 2020 |
A distributed tracking algorithm for reconstruction of graph signals X Wang, M Wang, Y Gu IEEE Journal of Selected Topics in Signal Processing 9 (4), 728-740, 2015 | 80 | 2015 |
Solving discounted stochastic two-player games with near-optimal time and sample complexity A Sidford, M Wang, L Yang, Y Ye International Conference on Artificial Intelligence and Statistics, 2992-3002, 2020 | 79 | 2020 |
Score approximation, estimation and distribution recovery of diffusion models on low-dimensional data M Chen, K Huang, T Zhao, M Wang International Conference on Machine Learning, 2024 | 75 | 2024 |