UL2: Unifying Language Learning Paradigms Y Tay, M Dehghani, VQ Tran, X Garcia, J Wei, X Wang, HW Chung, ... ICLR 2023, 2022 | 321* | 2022 |
ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning V Aribandi, Y Tay, T Schuster, J Rao, HS Zheng, SV Mehta, H Zhuang, ... ICLR 2022, 2021 | 183 | 2021 |
Transformer memory as a differentiable search index Y Tay, VQ Tran, M Dehghani, J Ni, D Bahri, H Mehta, Z Qin, K Hui, Z Zhao, ... NeurIPS 2022, 2022 | 174 | 2022 |
Charformer: Fast character transformers via gradient-based subword tokenization Y Tay, VQ Tran, S Ruder, J Gupta, HW Chung, D Bahri, Z Qin, ... ICLR 2022, 2021 | 122 | 2021 |
A new generation of Perspective API: Efficient multilingual character-level transformers A Lees, VQ Tran, Y Tay, J Sorensen, J Gupta, D Metzler, L Vasserman KDD'22 ADS, 2022 | 117 | 2022 |
Confident adaptive language modeling T Schuster, A Fisch, J Gupta, M Dehghani, D Bahri, VQ Tran, Y Tay, ... NeurIPS 2022, 2022 | 108 | 2022 |
Attributed question answering: Evaluation and modeling for attributed large language models B Bohnet, VQ Tran, P Verga, R Aharoni, D Andor, LB Soares, M Ciaramita, ... arXiv preprint arXiv:2212.08037, 2022 | 76 | 2022 |
Scaling laws vs model architectures: How does inductive bias influence scaling? Y Tay, M Dehghani, S Abnar, HW Chung, W Fedus, J Rao, S Narang, ... EMNLP 2023 Findings, 2022 | 57 | 2022 |
Recommender Systems with Generative Retrieval S Rajput, N Mehta, A Singh, RH Keshavan, T Vu, L Heldt, L Hong, Y Tay, ... NeurIPS 2023, 2023 | 56 | 2023 |
Making the case for Query-by-Voice with EchoQuery G Lyons, V Tran, C Binnig, U Cetintemel, T Kraska SIGMOD 2016, 2129-2132, 2016 | 55 | 2016 |
Transcending scaling laws with 0.1% extra compute Y Tay, J Wei, HW Chung, VQ Tran, DR So, S Shakeri, X Garcia, HS Zheng, ... EMNLP 2023, 2022 | 50 | 2022 |
How Does Generative Retrieval Scale to Millions of Passages? R Pradeep, K Hui, J Gupta, AD Lelkes, H Zhuang, J Lin, D Metzler, ... EMNLP 2023, 2023 | 43 | 2023 |
Quiz-Style Question Generation for News Stories AD Lelkes, VQ Tran, C Yu WWW '21: Proceedings of the Web Conference 2021, Pages 2501–2511, 2021 | 42 | 2021 |
DSI++: Updating Transformer Memory with New Documents SV Mehta, J Gupta, Y Tay, M Dehghani, VQ Tran, J Rao, M Najork, ... EMNLP 2023, 2022 | 35 | 2022 |
AgreeSum: Agreement-Oriented Multi-Document Summarization RY Pang, AD Lelkes, VQ Tran, C Yu ACL-IJCNLP 2021 Findings, 3377–3391, 2021 | 17 | 2021 |
Fractal Patterns May Unravel the Intelligence in Next-Token Prediction I Alabdulmohsin, VQ Tran, M Dehghani arXiv preprint arXiv:2402.01825, 2024 | 1 | 2024 |
Efficient Decoding of Output Sequences Using Adaptive Early Exiting T Schuster, AJ Fisch, JP Gupta, M Dehghani, D Bahri, VQ Tran, Y Tay, ... US Patent App. 18/222,395, 2024 | | 2024 |
Dense Feature Memory Augmented Transformers for COVID-19 Vaccination Search Classification J Gupta, Y Tay, C Kamath, VQ Tran, D Metzler, S Bavadekar, M Sun, ... EMNLP 2022 Industry, 2022 | | 2022 |
Crossword puzzle generator A Lelkes, C Keogh, RMH Gaughan III, K Tempero, C Yu, VQ Tran, ... US Patent 10,967,248, 2021 | | 2021 |