Deep double descent: Where bigger models and more data hurt. P Nakkiran, G Kaplun, Y Bansal, T Yang, B Barak, I Sutskever. International Conference on Learning Representations (ICLR), 2020. Cited by 1006.
SGD on Neural Networks Learns Functions of Increasing Complexity. P Nakkiran, G Kaplun, D Kalimeris, T Yang, B Edelman, H Zhang, B Barak. Advances in Neural Information Processing Systems 32, 3491-3501, 2019. Cited by 239*.
Having your cake and eating it too: Jointly optimal erasure codes for I/O, storage, and network-bandwidth. KV Rashmi, P Nakkiran, J Wang, NB Shah, K Ramchandran. 13th USENIX Conference on File and Storage Technologies (FAST '15), 81-94, 2015. Cited by 170.
Adversarial robustness may be at odds with simplicity. P Nakkiran. arXiv preprint arXiv:1901.00532, 2019. Cited by 134.
Optimal regularization can mitigate double descent. P Nakkiran, P Venkat, S Kakade, T Ma. arXiv preprint arXiv:2003.01897, 2020. Cited by 130.
Automatic gain control and multi-style training for robust small-footprint keyword spotting with deep neural networks. R Prabhavalkar, R Alvarez, C Parada, P Nakkiran, TN Sainath. 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015. Cited by 108.
Compressing deep neural networks using a rank-constrained topology. P Nakkiran, R Alvarez, R Prabhavalkar, C Parada. INTERSPEECH, 1473-1477, 2015. Cited by 96.
Revisiting model stitching to compare neural representations. Y Bansal, P Nakkiran, B Barak. Advances in Neural Information Processing Systems 34, 225-236, 2021. Cited by 82.
The deep bootstrap framework: Good online learners are good offline generalizers. P Nakkiran, B Neyshabur, H Sedghi. International Conference on Learning Representations (ICLR), 2021. Cited by 71*.
More data can hurt for linear regression: Sample-wise double descent. P Nakkiran. arXiv preprint arXiv:1912.07242, 2019. Cited by 67.
Limitations of neural collapse for understanding generalization in deep learning. L Hui, M Belkin, P Nakkiran. arXiv preprint arXiv:2202.08384, 2022. Cited by 52.
Benign, tempered, or catastrophic: Toward a refined taxonomy of overfitting. N Mallinar, J Simon, A Abedsoltan, P Pandit, M Belkin, P Nakkiran. Advances in Neural Information Processing Systems 35, 1182-1195, 2022. Cited by 51.
What algorithms can transformers learn? A study in length generalization. H Zhou, A Bradley, E Littwin, N Razin, O Saremi, J Susskind, S Bengio, P Nakkiran. arXiv preprint arXiv:2310.16028, 2023. Cited by 50.
Computational Limitations in Robust Classification and Win-Win Results. A Degwekar, P Nakkiran, V Vaikuntanathan. Proceedings of the Thirty-Second Conference on Learning Theory (COLT), PMLR 99, 994-1028, 2019. Cited by 39.
Distributional generalization: A new kind of generalization. P Nakkiran, Y Bansal. arXiv preprint arXiv:2009.08092, 2020. Cited by 37.
Rank-constrained neural networks. RA Guevara, P Nakkiran. US Patent 9,767,410, 2017. Cited by 28.
General strong polarization. J Błasiok, V Guruswami, P Nakkiran, A Rudra, M Sudan. Journal of the ACM (JACM) 69 (2), 1-67, 2022. Cited by 27.
Learning rate annealing can provably help generalization, even for convex problems. P Nakkiran. arXiv preprint arXiv:2005.07360, 2020. Cited by 25.
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Adversarial Examples are Just Bugs, Too. P Nakkiran. Distill 4 (8), e00019.5, 2019. Cited by 24.
Limitations of the NTK for understanding generalization in deep learning. N Vyas, Y Bansal, P Nakkiran. arXiv preprint arXiv:2206.10012, 2022. Cited by 23.