A Closer Look at Memorization in Deep Networks D Arpit*, S Jastrzebski*, N Ballas*, D Krueger*, E Bengio, MS Kanwal, ... International Conference on Machine Learning 2017, 2017 | 1427 | 2017 |
Parameter-Efficient Transfer Learning for NLP N Houlsby, A Giurgiu*, S Jastrzebski*, B Morrone, Q Laroussilhe, ... International Conference on Machine Learning (ICML) 2019, 2019 | 1352 | 2019 |
Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening N Wu, J Phang, J Park, Y Shen, Z Huang, M Zorin, S Jastrzebski, T Févry, ... IEEE Trans Med Imaging, 2019 | 447 | 2019 |
Three factors influencing minima in SGD S Jastrzebski*, Z Kenton*, D Arpit, N Ballas, A Fischer, Y Bengio, ... International Conference on Artificial Neural Networks 2018; International …, 2017 | 428 | 2017 |
Molecule Attention Transformer Ł Maziarka, T Danel, S Mucha, K Rataj, J Tabor, S Jastrzębski NeurIPS 2019 workshop; arXiv preprint arXiv:2002.08264, 2020 | 124 | 2020 |
The Break-Even Point on Optimization Trajectories of Deep Neural Networks S Jastrzebski, M Szymczak, S Fort, D Arpit, J Tabor, K Cho, K Geras International Conference on Learning Representations (ICLR) 2020, 2020 | 109 | 2020 |
Residual connections encourage iterative inference S Jastrzebski*, D Arpit*, N Ballas, V Verma, T Che, Y Bengio International Conference on Learning Representations (ICLR) 2018, 2017 | 103 | 2017 |
On the Relation Between the Sharpest Directions of DNN Loss and the SGD Step Length S Jastrzębski, Z Kenton, N Ballas, A Fischer, Y Bengio, A Storkey International Conference on Learning Representations (ICLR) 2019, 2019 | 93 | 2019 |
Learning to SMILE(S) S Jastrzebski, D Lesniak, WM Czarnecki International Conference on Learning Representations 2016 (Workshop track), 2016 | 93* | 2016 |
An artificial intelligence system for predicting the deterioration of COVID-19 patients in the emergency department FE Shamout, Y Shen, N Wu, A Kaku, J Park, T Makino, S Jastrzębski, ... NPJ digital medicine 4 (1), 80, 2021 | 91 | 2021 |
Learning to Compute Word Embeddings on the Fly D Bahdanau, T Bosc*, S Jastrzebski*, E Grefenstette, P Vincent, Y Bengio Montreal AI Symposium 2017, 2017 | 88 | 2017 |
How to evaluate word embeddings? On importance of data efficiency and simple supervised tasks S Jastrzebski, D Leśniak, WM Czarnecki arXiv preprint arXiv:1702.02170, 2017 | 70 | 2017 |
Molecule Edit Graph Attention Network: Modeling Chemical Reactions as Sequences of Graph Edits M Sacha, P Błaż, Mikołaj, Byrski, P Włodarczyk-Pruszyński, S Jastrzębski Journal of Chemical Information and Modeling (JCIM), 2020 | 69 | 2020 |
Stiffness: A new perspective on generalization in neural networks S Fort, PK Nowak, S Jastrzebski, S Narayanan arXiv preprint arXiv:1901.09491, 2019 | 65 | 2019 |
Evolutionary-Neural Hybrid Agents for Architecture Search K Maziarz, A Khorlin, Q de Laroussilhe, S Jastrzebski, T Mingxing, ... arXiv preprint arXiv:1811.09828, 2018 | 65* | 2018 |
Large Scale Structure of Neural Network Loss Landscapes S Fort, S Jastrzebski NeurIPS 2019, 2019 | 60 | 2019 |
Cramer-Wold Auto-Encoder S Knop, P Spurek, J Tabor, I Podolak, M Mazur, S Jastrzębski Journal of Machine Learning Research 21 (164), 1-28, 2020 | 44* | 2020 |
Osprey: Hyperparameter optimization for machine learning R McGibbon, C Hernández, M Harrigan, S Kearnes, M Sultan, ... Journal of Open Source Software 1 (5), 34, 2016 | 43 | 2016 |
Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization S Jastrzebski, D Arpit, O Astrand, G Kerg, H Wang, C Xiong, R Socher, ... International Conference on Machine Learning (ICML) 2021, 2020 | 39 | 2020 |
Dynamical Isometry is Achieved in Residual Networks in a Universal Way for any Activation Function W Tarnowski, P Warchoł, S Jastrzebski, J Tabor, M Nowak AISTATS 2019, 2018 | 31 | 2018 |