Theano: A Python framework for fast computation of mathematical expressions R Al-Rfou, G Alain, A Almahairi, C Angermueller, D Bahdanau, N Ballas, ... arXiv, arXiv: 1605.02688, 2016 | 978 | 2016 |
Towards end-to-end speech recognition with deep convolutional neural networks Y Zhang, M Pezeshki, P Brakel, S Zhang, C Laurent, Y Bengio, A Courville arXiv preprint arXiv:1701.02720, 2017 | 484 | 2017 |
Zoneout: Regularizing RNNs by randomly preserving hidden activations D Krueger, T Maharaj, J Kramár, M Pezeshki, N Ballas, NR Ke, A Goyal, ... arXiv preprint arXiv:1606.01305, 2016 | 383 | 2016 |
Gradient starvation: A learning proclivity in neural networks M Pezeshki, O Kaba, Y Bengio, AC Courville, D Precup, G Lajoie Advances in Neural Information Processing Systems 34, 1256-1272, 2021 | 261 | 2021 |
Theano: A Python framework for fast computation of mathematical expressions TTD Team, R Al-Rfou, G Alain, A Almahairi, C Angermueller, D Bahdanau, ... arXiv preprint arXiv:1605.02688, 2016 | 214 | 2016 |
Negative momentum for improved game dynamics G Gidel, RA Hemmat, M Pezeshki, R Le Priol, G Huang, S Lacoste-Julien, ... The 22nd International Conference on Artificial Intelligence and Statistics …, 2019 | 198 | 2019 |
Simple data balancing achieves competitive worst-group-accuracy BY Idrissi, M Arjovsky, M Pezeshki, D Lopez-Paz Conference on Causal Learning and Reasoning, 336-351, 2022 | 138 | 2022 |
Deconstructing the Ladder Network Architecture M Pezeshki, L Fan, P Brakel, A Courville, Y Bengio arXiv preprint arXiv:1511.06430, 2015 | 134 | 2015 |
Theano: A Python framework for fast computation of mathematical expressions R Al-Rfou, G Alain, A Almahairi, C Angermueller, D Bahdanau, N Ballas, ... arXiv preprint arXiv:1605.02688, 2016 | 49 | 2016 |
On the learning dynamics of deep neural networks R Tachet, M Pezeshki, S Shabanian, A Courville, Y Bengio arXiv preprint arXiv:1809.06848, 2018 | 40* | 2018 |
Multi-scale Feature Learning Dynamics: Insights for Double Descent M Pezeshki, A Mitra, Y Bengio, G Lajoie https://arxiv.org/pdf/2112.03215.pdf, 2021 | 20 | 2021 |
Comparison three methods of clustering: K-means, spectral clustering and hierarchical clustering K Kowsari, T Borsche, A Ulbig, G Andersson, AM Saxe, JL McClelland, ... arXiv Preprint, 2013 | 16* | 2013 |
Sequence modeling using gated recurrent neural networks M Pezeshki arXiv preprint arXiv:1501.00299, 2015 | 15 | 2015 |
Deep belief networks for image denoising MA Keyvanrad, M Pezeshki, MA Homayounpour arXiv preprint arXiv:1312.6158, 2013 | 11 | 2013 |
Predicting grokking long before it happens: A look into the loss landscape of models which grok P Notsawo Jr, H Zhou, M Pezeshki, I Rish, G Dumas arXiv preprint arXiv:2306.13253, 2023 | 7 | 2023 |
Feedback-guided Data Synthesis for Imbalanced Classification R Askari Hemmat, M Pezeshki, F Bordes, M Drozdzal, A Romero-Soriano arXiv e-prints, arXiv: 2310.00158, 2023 | 5* | 2023 |
Discovering environments with XRM M Pezeshki, D Bouchacourt, M Ibrahim, N Ballas, P Vincent, D Lopez-Paz arXiv preprint arXiv:2309.16748, 2023 | 2 | 2023 |
Dynamics of learning and generalization in neural networks M Pezeshki | | 2022 |