Noam Shazeer - Academic Profile - Scholar Search

Cited by

	All	Since 2019
Citations	175322	170274
h-index	58	53
i10-index	92	76

Citations per year (chart): 2017: 665; 2018: 2351; 2019: 6935; 2020: 13606; 2021: 23515; 2022: 36619; 2023: 56941; 2024: 32583

Public access

1 article available
0 articles not available

Based on funding mandates


Noam Shazeer

Character.ai

Verified email at character.ai

Deep Learning


Title	Cited by	Year
Attention is all you need A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ... Advances in neural information processing systems 30, 2017	125328	2017
Exploring the limits of transfer learning with a unified text-to-text transformer C Raffel, N Shazeer, A Roberts, K Lee, S Narang, M Matena, Y Zhou, W Li, ... Journal of machine learning research 21 (140), 1-67, 2020	15950	2020
Attention is all you need [J] A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez Advances in neural information processing systems 30 (1), 261-272, 2017	5256*	2017
Palm: Scaling language modeling with pathways A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ... Journal of Machine Learning Research 24 (240), 1-113, 2023	3823	2023
Scheduled sampling for sequence prediction with recurrent neural networks S Bengio, O Vinyals, N Jaitly, N Shazeer Advances in neural information processing systems 28, 2015	2270	2015
Outrageously large neural networks: The sparsely-gated mixture-of-experts layer N Shazeer, A Mirhoseini, K Maziarz, A Davis, Q Le, G Hinton, J Dean arXiv preprint arXiv:1701.06538, 2017	1952	2017
Image transformer N Parmar, A Vaswani, J Uszkoreit, L Kaiser, N Shazeer, A Ku, D Tran International conference on machine learning, 4055-4064, 2018	1874	2018
Advances in neural information processing systems A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ... Attention is all you need, 2017	1809	2017
Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity W Fedus, B Zoph, N Shazeer Journal of Machine Learning Research 23 (120), 1-39, 2022	1441	2022
Exploring the limits of language modeling R Jozefowicz, O Vinyals, M Schuster, N Shazeer, Y Wu arXiv preprint arXiv:1602.02410, 2016	1377	2016
Lamda: Language models for dialog applications R Thoppilan, D De Freitas, J Hall, N Shazeer, A Kulshreshtha, HT Cheng, ... arXiv preprint arXiv:2201.08239, 2022	1217	2022
Generating wikipedia by summarizing long sequences PJ Liu, M Saleh, E Pot, B Goodrich, R Sepassi, L Kaiser, N Shazeer arXiv preprint arXiv:1801.10198, 2018	928	2018
Gomez Aidan N., Kaiser Łukasz, and Polosukhin Illia. 2017 V Ashish, S Noam, P Niki, U Jakob, J Llion Attention is all you need. In Advances in neural information processing …, 2017	868	2017
Music transformer CZA Huang, A Vaswani, J Uszkoreit, N Shazeer, I Simon, C Hawthorne, ... arXiv preprint arXiv:1809.04281, 2018	847	2018
Adafactor: Adaptive learning rates with sublinear memory cost N Shazeer, M Stern International Conference on Machine Learning, 4596-4604, 2018	823	2018
End-to-end text-dependent speaker verification G Heigold, I Moreno, S Bengio, N Shazeer 2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016	768	2016
Gshard: Scaling giant models with conditional computation and automatic sharding D Lepikhin, HJ Lee, Y Xu, D Chen, O Firat, Y Huang, M Krikun, N Shazeer, ... arXiv preprint arXiv:2006.16668, 2020	762	2020
How much knowledge can you pack into the parameters of a language model? A Roberts, C Raffel, N Shazeer arXiv preprint arXiv:2002.08910, 2020	724	2020
Tensor2tensor for neural machine translation A Vaswani, S Bengio, E Brevdo, F Chollet, AN Gomez, S Gouws, L Jones, ... arXiv preprint arXiv:1803.07416, 2018	622	2018
Glu variants improve transformer N Shazeer arXiv preprint arXiv:2002.05202, 2020	421	2020


Articles 1–20
