Generative adversarial nets I Goodfellow, J Pouget-Abadie, M Mirza, B Xu, D Warde-Farley, S Ozair, ... Advances in neural information processing systems 27, 2014 | 78928* | 2014 |
Deep learning Y LeCun, Y Bengio, G Hinton Nature 521 (7553), 436-444, 2015 | 78614 | 2015 |
Gradient-based learning applied to document recognition Y LeCun, L Bottou, Y Bengio, P Haffner Proceedings of the IEEE 86 (11), 2278-2324, 1998 | 65578 | 1998 |
Deep learning I Goodfellow, Y Bengio, A Courville MIT Press, 2016 | 64101 | 2016 |
Neural machine translation by jointly learning to align and translate D Bahdanau, K Cho, Y Bengio arXiv preprint arXiv:1409.0473, 2014 | 34697 | 2014 |
Learning phrase representations using RNN encoder-decoder for statistical machine translation K Cho, B Van Merriënboer, C Gulcehre, D Bahdanau, F Bougares, ... arXiv preprint arXiv:1406.1078, 2014 | 29491 | 2014 |
Understanding the difficulty of training deep feedforward neural networks X Glorot, Y Bengio Proceedings of the thirteenth international conference on artificial …, 2010 | 24424 | 2010 |
Graph attention networks P Velickovic, G Cucurull, A Casanova, A Romero, P Lio, Y Bengio arXiv preprint arXiv:1710.10903, 2017 | 20869* | 2017 |
Empirical evaluation of gated recurrent neural networks on sequence modeling J Chung, C Gulcehre, KH Cho, Y Bengio arXiv preprint arXiv:1412.3555, 2014 | 16150 | 2014 |
Representation learning: A review and new perspectives Y Bengio, A Courville, P Vincent IEEE transactions on pattern analysis and machine intelligence 35 (8), 1798-1828, 2013 | 15679 | 2013 |
Learning deep architectures for AI Y Bengio Foundations and trends® in Machine Learning 2 (1), 1-127, 2009 | 12413 | 2009 |
Learning long-term dependencies with gradient descent is difficult Y Bengio, P Simard, P Frasconi IEEE transactions on neural networks 5 (2), 157-166, 1994 | 12350 | 1994 |
Show, attend and tell: Neural image caption generation with visual attention K Xu, J Ba, R Kiros, K Cho, A Courville, R Salakhudinov, R Zemel, ... International conference on machine learning, 2048-2057, 2015 | 12328 | 2015 |
Deep sparse rectifier neural networks X Glorot, A Bordes, Y Bengio Proceedings of the fourteenth international conference on artificial …, 2011 | 12168 | 2011 |
Random search for hyper-parameter optimization J Bergstra, Y Bengio Journal of machine learning research 13 (2), 2012 | 12122 | 2012 |
A neural probabilistic language model Y Bengio, R Ducharme, P Vincent Journal of Machine Learning Research 3, 1137-1155, 2003 | 11405 | 2003 |
How transferable are features in deep neural networks? J Yosinski, J Clune, Y Bengio, H Lipson Advances in neural information processing systems 27, 2014 | 10507 | 2014 |
Extracting and composing robust features with denoising autoencoders P Vincent, H Larochelle, Y Bengio, PA Manzagol Proceedings of the 25th international conference on Machine learning, 1096-1103, 2008 | 9027 | 2008 |
Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion P Vincent, H Larochelle, I Lajoie, Y Bengio, PA Manzagol, L Bottou Journal of machine learning research 11 (12), 2010 | 8946 | 2010 |
On the properties of neural machine translation: Encoder-decoder approaches K Cho, B Van Merriënboer, D Bahdanau, Y Bengio arXiv preprint arXiv:1409.1259, 2014 | 8762 | 2014 |