Simultaneous clustering and estimation of heterogeneous graphical models B Hao, WW Sun, Y Liu, G Cheng Journal of Machine Learning Research 18 (217), 1-58, 2018 | 77 | 2018 |
Adaptive exploration in linear contextual bandit B Hao, T Lattimore, C Szepesvári International Conference on Artificial Intelligence and Statistics, 3536-3545, 2020 | 67 | 2020 |
Bootstrapping upper confidence bound B Hao, Y Abbasi-Yadkori, Z Wen, G Cheng 33rd Conference on Neural Information Processing Systems, 2019 | 62 | 2019 |
Sparse and low-rank tensor estimation via cubic sketchings B Hao, AR Zhang, G Cheng International Conference on Artificial Intelligence and Statistics, 1319-1330, 2020 | 61 | 2020 |
High-dimensional sparse linear bandits B Hao, T Lattimore, M Wang 34th Conference on Neural Information Processing Systems, 2020 | 60 | 2020 |
Bootstrapping fitted Q-evaluation for off-policy inference B Hao, X Ji, Y Duan, H Lu, C Szepesvári, M Wang International Conference on Machine Learning, 4074-4084, 2021 | 39 | 2021 |
Sparse feature selection makes batch reinforcement learning more sample efficient B Hao, Y Duan, T Lattimore, C Szepesvári, M Wang International Conference on Machine Learning, 4063-4073, 2021 | 36 | 2021 |
Online sparse reinforcement learning B Hao, T Lattimore, C Szepesvári, M Wang International Conference on Artificial Intelligence and Statistics, 316-324, 2021 | 30 | 2021 |
Sparse tensor additive regression B Hao, B Wang, P Wang, J Zhang, J Yang, WW Sun Journal of Machine Learning Research 22 (64), 1-43, 2021 | 29 | 2021 |
Adaptive approximate policy iteration B Hao, N Lazić, Y Abbasi-Yadkori, P Joulani, C Szepesvári Proceedings of the 24th International Conference on Artificial Intelligence …, 2020 | 28* | 2020 |
Efficient local planning with linear function approximation D Yin, B Hao, Y Abbasi-Yadkori, N Lazić, C Szepesvári International Conference on Algorithmic Learning Theory, 1165-1192, 2022 | 25 | 2022 |
Residual bootstrap exploration for bandit algorithms CH Wang, Y Yu, B Hao, G Cheng arXiv preprint arXiv:2002.08436, 2020 | 20 | 2020 |
Regret bounds for information-directed reinforcement learning B Hao, T Lattimore Advances in Neural Information Processing Systems, 2022 | 19 | 2022 |
Information directed sampling for sparse linear bandits B Hao, T Lattimore, W Deng Advances in Neural Information Processing Systems 34, 16738-16750, 2021 | 19 | 2021 |
The neural testbed: Evaluating joint predictions I Osband, Z Wen, SM Asghari, V Dwaracherla, X Lu, M Ibrahimi, ... Advances in Neural Information Processing Systems 35, 12554-12565, 2022 | 18 | 2022 |
Contextual information-directed sampling B Hao, T Lattimore, C Qin International Conference on Machine Learning, 8446-8464, 2022 | 16 | 2022 |
Bootstrapping statistical inference for off-policy evaluation B Hao, X Ji, Y Duan, H Lu, C Szepesvári, M Wang arXiv preprint arXiv:2102.03607, 2021 | 16 | 2021 |
Interacting contour stochastic gradient Langevin dynamics W Deng, S Liang, B Hao, G Lin, F Liang The Tenth International Conference on Learning Representations, 2022 | 13 | 2022 |
Low-rank tensor bandits B Hao, J Zhou, Z Wen, WW Sun arXiv preprint arXiv:2007.15788, 2020 | 12 | 2020 |
Bandit phase retrieval T Lattimore, B Hao Advances in Neural Information Processing Systems 34, 18801-18811, 2021 | 11 | 2021 |