On the difficulty of training recurrent neural networks | R Pascanu, T Mikolov, Y Bengio | International conference on machine learning, 1310-1318, 2013 | 7269 | 2013 |
Overcoming catastrophic forgetting in neural networks | J Kirkpatrick, R Pascanu, N Rabinowitz, J Veness, G Desjardins, AA Rusu, ... | Proceedings of the national academy of sciences 114 (13), 3521-3526, 2017 | 6962 | 2017 |
Relational inductive biases, deep learning, and graph networks | PW Battaglia, JB Hamrick, V Bapst, A Sanchez-Gonzalez, V Zambaldi, ... | arXiv preprint arXiv:1806.01261, 2018 | 3533 | 2018 |
Progressive neural networks | AA Rusu, NC Rabinowitz, G Desjardins, H Soyer, J Kirkpatrick, ... | arXiv preprint arXiv:1606.04671, 2016 | 2808 | 2016 |
On the number of linear regions of deep neural networks | GF Montufar, R Pascanu, K Cho, Y Bengio | Advances in neural information processing systems 27, 2014 | 2753 | 2014 |
Theano: a CPU and GPU math expression compiler | J Bergstra, O Breuleux, F Bastien, P Lamblin, R Pascanu, G Desjardins, ... | Proceedings of the Python for scientific computing conference (SciPy) 4 (3), 1-7, 2010 | 2022 | 2010 |
A simple neural network module for relational reasoning | A Santoro, D Raposo, DG Barrett, M Malinowski, R Pascanu, P Battaglia, ... | Advances in neural information processing systems 30, 2017 | 1851 | 2017 |
Identifying and attacking the saddle point problem in high-dimensional non-convex optimization | YN Dauphin, R Pascanu, C Gulcehre, K Cho, S Ganguli, Y Bengio | Advances in neural information processing systems 27, 2014 | 1751 | 2014 |
Theano: new features and speed improvements | F Bastien, P Lamblin, R Pascanu, J Bergstra, I Goodfellow, A Bergeron, ... | arXiv preprint arXiv:1211.5590, 2012 | 1699 | 2012 |
Interaction networks for learning about objects, relations and physics | P Battaglia, R Pascanu, M Lai, D Jimenez Rezende | Advances in neural information processing systems 29, 2016 | 1584 | 2016 |
Meta-learning with latent embedding optimization | AA Rusu, D Rao, J Sygnowski, O Vinyals, R Pascanu, S Osindero, ... | arXiv preprint arXiv:1807.05960, 2018 | 1549 | 2018 |
How to construct deep recurrent neural networks | R Pascanu, C Gulcehre, K Cho, Y Bengio | arXiv preprint arXiv:1312.6026, 2013 | 1354 | 2013 |
Learning to navigate in complex environments | P Mirowski, R Pascanu, F Viola, H Soyer, AJ Ballard, A Banino, M Denil, ... | arXiv preprint arXiv:1611.03673, 2016 | 957 | 2016 |
Theano: A Python framework for fast computation of mathematical expressions | R Al-Rfou, G Alain, A Almahairi, C Angermueller, D Bahdanau, N Ballas, ... | arXiv e-prints, arXiv:1605.02688, 2016 | 916 | 2016 |
Progress & compress: A scalable framework for continual learning | J Schwarz, W Czarnecki, J Luketina, A Grabska-Barwinska, YW Teh, ... | International conference on machine learning, 4528-4537, 2018 | 874 | 2018 |
Theano: A CPU and GPU Math Compiler in Python | J Bergstra, O Breuleux, F Bastien, P Lamblin, R Pascanu, G Desjardins, ... | SciPy 4, 1-7, 2010 | 854 | 2010 |
Model compression via distillation and quantization | A Polino, R Pascanu, D Alistarh | arXiv preprint arXiv:1802.05668, 2018 | 779 | 2018 |
Understanding the exploding gradient problem | R Pascanu, T Mikolov, Y Bengio | CoRR, abs/1211.5063 2 (417), 1, 2012 | 770 | 2012 |
Policy distillation | AA Rusu, SG Colmenarejo, C Gulcehre, G Desjardins, J Kirkpatrick, ... | arXiv preprint arXiv:1511.06295, 2015 | 761 | 2015 |
Sharp minima can generalize for deep nets | L Dinh, R Pascanu, S Bengio, Y Bengio | International Conference on Machine Learning, 1019-1028, 2017 | 738 | 2017 |