Piecewise linear activations substantially shape the loss surfaces of neural networks F He*, B Wang*, D Tao International Conference on Learning Representations (ICLR), 2020 | 30 | 2020 |
The implicit bias for adaptive optimization algorithms on homogeneous neural networks B Wang, Q Meng, W Chen, TY Liu International Conference on Machine Learning, 10849-10858, 2021 | 28 | 2021 |
Machine-learning nonconservative dynamics for new-physics detection Z Liu, B Wang, Q Meng, W Chen, M Tegmark, TY Liu Physical Review E 104 (5), 055302, 2021 | 27 | 2021 |
Provable adaptivity in Adam B Wang, Y Zhang, H Zhang, Q Meng, ZM Ma, TY Liu, W Chen arXiv preprint arXiv:2208.09900, 2022 | 21 | 2022 |
Convergence of AdaGrad for non-convex objectives: Simple proofs and relaxed assumptions B Wang, H Zhang, Z Ma, W Chen The Thirty Sixth Annual Conference on Learning Theory, 161-190, 2023 | 20 | 2023 |
Tighter generalization bounds for iterative differentially private learning algorithms F He*, B Wang*, D Tao Uncertainty in Artificial Intelligence (UAI), 2021 | 15 | 2021 |
Creating training sets via weak indirect supervision J Zhang, B Wang, X Song, Y Wang, Y Yang, J Bai, A Ratner ICLR, 2022 | 15* | 2022 |
Does Momentum Change the Implicit Regularization on Separable Data? B Wang, Q Meng, H Zhang, R Sun, W Chen, ZM Ma NeurIPS, 2022 | 11* | 2022 |
Robustness, privacy, and generalization of adversarial training F He, S Fu, B Wang, D Tao arXiv preprint arXiv:2012.13573, 2020 | 8 | 2020 |
Closing the gap between the upper bound and lower bound of Adam's iteration complexity B Wang, J Fu, H Zhang, N Zheng, W Chen Advances in Neural Information Processing Systems 36, 2024 | 7 | 2024 |
On the trade-off of intra-/inter-class diversity for supervised pre-training J Zhang, B Wang, Z Hu, PWW Koh, AJ Ratner Advances in Neural Information Processing Systems 36, 2024 | 7 | 2024 |
O-GNN: incorporating ring priors into molecular modeling J Zhu, K Wu, B Wang, Y Xia, S Xie, Q Meng, L Wu, T Qin, W Zhou, H Li, ... The Eleventh International Conference on Learning Representations, 2022 | 7 | 2022 |
Optimizing Information-theoretical Generalization Bounds via Anisotropic Noise in SGLD B Wang, H Zhang, J Zhang, Q Meng, W Chen, TY Liu 35th Conference on Neural Information Processing Systems (NeurIPS 2021), 2021 | 7 | 2021 |
When and why momentum accelerates SGD: An empirical study J Fu, B Wang, H Zhang, Z Zhang, W Chen, N Zheng arXiv preprint arXiv:2306.09000, 2023 | 3 | 2023 |
Fast Conditional Mixing of MCMC Algorithms for Non-log-concave Distributions X Cheng, B Wang, J Zhang, Y Zhu Advances in Neural Information Processing Systems 36, 2024 | 2 | 2024 |
Towards Understanding the Riemannian SGD and SVRG Flows on Wasserstein Probabilistic Space M Yi, B Wang arXiv preprint arXiv:2401.13530, 2024 | | 2024 |
Fast conditional mixing of MCMC algorithms for non-log-concave distributions B Wang, X Cheng, J Zhang, Y Zhu Proceedings of the 37th International Conference on Neural Information …, 2023 | | 2023 |
Large Catapults in Momentum Gradient Descent with Warmup: An Empirical Study P Phunyaphibarn, J Lee, B Wang, H Zhang, C Yun arXiv preprint arXiv:2311.15051, 2023 | | 2023 |