Towards theoretically understanding why SGD generalizes better than Adam in deep learning P Zhou, J Feng, C Ma, C Xiong, S Hoi Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020 | 254 | 2020 |
How SGD selects the global minima in over-parameterized learning: A dynamical stability perspective L Wu, C Ma Advances in Neural Information Processing Systems 31, 2018 | 217 | 2018 |
The Barron Space and the Flow-Induced Function Spaces for Neural Network Models W E, C Ma, L Wu Constructive Approximation, https://doi.org/10.1007/s00365-021-09549-y, 2021 | 184* | 2021 |
A priori estimates of the population risk for two-layer neural networks C Ma, L Wu Communications in Mathematical Sciences, 2019 17 (5), 1407-1425, 2018 | 113 | 2018 |
Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't C Ma, S Wojtowytsch, L Wu CSIAM Trans. Appl. Math. 1 (4), 561-615, 2020 | 111 | 2020 |
Bispectrum inversion with application to multireference alignment T Bendory, N Boumal, C Ma, Z Zhao, A Singer IEEE Transactions on Signal Processing 66 (4), 1037-1050, 2017 | 97 | 2017 |
A comparative analysis of the optimization and generalization property of two-layer neural network and random feature models under gradient descent dynamics C Ma, L Wu Science China Mathematics 63 (7), 1235–1258, 2019 | 91* | 2019 |
A mean field analysis of deep ResNet and beyond: Towards provably optimization via overparameterization from depth Y Lu, C Ma, Y Lu, J Lu, L Ying International Conference on Machine Learning, 6426-6436, 2020 | 85 | 2020 |
Uniformly Accurate Machine Learning Based Hydrodynamic Models for Kinetic Equations J Han, C Ma, Z Ma, W E Proceedings of the National Academy of Sciences, 2019 | 84 | 2019 |
Model reduction with memory and the machine learning of dynamical systems C Ma, J Wang Commun. Comput. Phys. 25 (2019), pp. 947-962, 2018 | 75 | 2018 |
Machine learning from a continuous viewpoint, I C Ma, L Wu Science China Mathematics 63 (11), 2233-2266, 2020 | 67 | 2020 |
Modeling subgrid-scale force and divergence of heat flux of compressible isotropic turbulence by artificial neural network C Xie, K Li, C Ma, J Wang Physical Review Fluids 4 (10), 104605, 2019 | 63 | 2019 |
Artificial neural network approach to large-eddy simulation of compressible isotropic turbulence C Xie, J Wang, K Li, C Ma Physical Review E, 2019 | 62 | 2019 |
Rademacher complexity and the generalization error of residual networks C Ma, Q Wang Communications in Mathematical Sciences 18 (6), 1755-1774, 2020 | 57* | 2020 |
On linear stability of SGD and input-smoothness of neural networks C Ma, L Ying Advances in Neural Information Processing Systems 34, 16805-16817, 2021 | 56* | 2021 |
Global convergence of gradient descent for deep linear residual networks L Wu, Q Wang, C Ma Advances in Neural Information Processing Systems 32 (NeurIPS 2019), 2019 | 27 | 2019 |
Heterogeneous multireference alignment for images with application to 2D classification in single particle reconstruction C Ma, T Bendory, N Boumal, F Sigworth, A Singer IEEE Transactions on Image Processing 29, 1699-1710, 2019 | 27 | 2019 |
Globally convergent Levenberg-Marquardt method for phase retrieval C Ma, X Liu, Z Wen IEEE Transactions on Information Theory 65 (4), 2343-2359, 2018 | 26 | 2018 |
Beyond the quadratic approximation: the multiscale structure of neural network loss landscapes C Ma, D Kunin, L Wu, L Ying arXiv preprint arXiv:2204.11326, 2022 | 25 | 2022 |
The Multiscale Structure of Neural Network Loss Functions: The Effect on Optimization and Origin C Ma, D Kunin, L Wu, L Ying arXiv preprint arXiv:2204.11326, 2022 | 24 | 2022 |