| Title | Authors | Venue | Cited by | Year |
| --- | --- | --- | --- | --- |
| Gemini: a family of highly capable multimodal models | G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... | arXiv preprint arXiv:2312.11805, 2023 | 852 | 2023 |
| Improving massively multilingual neural machine translation and zero-shot translation | B Zhang, P Williams, I Titov, R Sennrich | ACL, 2020 | 320 | 2020 |
| Root mean square layer normalization | B Zhang, R Sennrich | NeurIPS, 2019 | 272 | 2019 |
| Revisiting low-resource neural machine translation: A case study | R Sennrich, B Zhang | ACL, 2019 | 271 | 2019 |
| Variational neural machine translation | B Zhang, D Xiong, J Su, H Duan, M Zhang | EMNLP, 2016 | 231 | 2016 |
| Neural machine translation with GRU-gated attention model | B Zhang, D Xiong, J Xie, J Su | TNNLS 31 (11), 4688-4698, 2020 | 132* | 2020 |
| Shallow convolutional neural network for implicit discourse relation recognition | B Zhang, J Su, D Xiong, Y Lu, H Duan, J Yao | EMNLP, 2230-2235, 2015 | 129 | 2015 |
| Accelerating neural transformer via an average attention network | B Zhang, D Xiong, J Su | ACL, 2018 | 128 | 2018 |
| Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ... | arXiv preprint arXiv:2403.05530, 2024 | 126 | 2024 |
| Prompting large language model for machine translation: A case study | B Zhang, B Haddow, A Birch | International Conference on Machine Learning, 41092-41110, 2023 | 115 | 2023 |
| Neural machine translation with deep attention | B Zhang, D Xiong, J Su | TPAMI 42 (1), 154-163, 2018 | 115 | 2018 |
| Improving deep transformer with depth-scaled initialization and merged attention | B Zhang, I Titov, R Sennrich | EMNLP, 2019 | 94 | 2019 |
| Variational recurrent neural machine translation | J Su, S Wu, D Xiong, Y Lu, X Han, B Zhang | Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018 | 92 | 2018 |
| Share or not? learning to schedule language-specific capacity for multilingual translation | B Zhang, A Bapna, R Sennrich, O Firat | ICLR, 2020 | 81 | 2020 |
| A context-aware recurrent encoder for neural machine translation | B Zhang, D Xiong, J Su, H Duan | TASLP 25 (12), 2424-2432, 2017 | 71 | 2017 |
| Adaptive feature selection for end-to-end speech translation | B Zhang, I Titov, B Haddow, R Sennrich | EMNLP Findings, 2020 | 41 | 2020 |
| Sparse Attention with Linear Units | B Zhang, I Titov, R Sennrich | EMNLP, 2021 | 35 | 2021 |
| Madlad-400: A multilingual and document-level large audited dataset | S Kudugunta, I Caswell, B Zhang, X Garcia, D Xin, A Kusupati, R Stella, ... | Advances in Neural Information Processing Systems 36, 2024 | 34 | 2024 |
| Data Scaling Laws in NMT: The Effect of Noise and Architecture | Y Bansal, B Ghorbani, A Garg, B Zhang, M Krikun, C Cherry, B Neyshabur, ... | ICML, 2022 | 31 | 2022 |
| A neural generative autoencoder for bilingual word embeddings | J Su, S Wu, B Zhang, C Wu, Y Qin, D Xiong | Information Sciences 424, 287-300, 2018 | 31 | 2018 |