Benign Overfitting in Two-layer ReLU Convolutional Neural Networks Y Kou, Z Chen, Y Chen, Q Gu International Conference on Machine Learning (ICML), 2023 | 22 | 2023 |
Why does sharpness-aware minimization generalize better than SGD? Z Chen, J Zhang, Y Kou, X Chen, CJ Hsieh, Q Gu Advances in Neural Information Processing Systems 36, 2024 | 10 | 2024 |
Implicit bias of gradient descent for two-layer ReLU and leaky ReLU networks on nearly-orthogonal data Y Kou, Z Chen, Q Gu Advances in Neural Information Processing Systems 36, 2024 | 9 | 2024 |
How Does Semi-supervised Learning with Pseudo-labelers Work? A Case Study Y Kou, Z Chen, Y Cao, Q Gu The Eleventh International Conference on Learning Representations, 2023 | 6 | 2023 |
Certified adversarial robustness under the bounded support set Y Kou, Q Zheng, Y Wang International Conference on Machine Learning, 11559-11597, 2022 | 3 | 2022 |
Matching the Statistical Query Lower Bound for k-sparse Parity Problems with Stochastic Gradient Descent Y Kou, Z Chen, Q Gu, SM Kakade arXiv preprint arXiv:2404.12376, 2024 | 1 | 2024 |
Fast Sampling via De-randomization for Discrete Diffusion Models Z Chen, H Yuan, Y Li, Y Kou, J Zhang, Q Gu arXiv preprint arXiv:2312.09193, 2023 | 1 | 2023 |
Guided Discrete Diffusion for Electronic Health Record Generation Z Chen, J Han, Y Li, Y Kou, E Halperin, RE Tillman, Q Gu arXiv preprint arXiv:2404.12314, 2024 | | 2024 |
Benign overfitting for two-layer ReLU networks Y Kou, Z Chen, Y Chen, Q Gu arXiv preprint arXiv:2303.04145, 2023 | | 2023 |
On the Power of Multitask Representation Learning with Gradient Descent Q Li, Z Chen, Y Deng, Y Kou, Y Cao, Q Gu | | |