Exploring the limits of transfer learning with a unified text-to-text transformer C Raffel, N Shazeer, A Roberts, K Lee, S Narang, M Matena, Y Zhou, W Li, ... The Journal of Machine Learning Research 21 (1), 5485-5551, 2020 | 16005 | 2020 |
PaLM: Scaling language modeling with pathways A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ... Journal of Machine Learning Research 24 (240), 1-113, 2023 | 3859 | 2023 |
Extracting Training Data from Large Language Models N Carlini, F Tramer, E Wallace, M Jagielski, A Herbert-Voss, K Lee, ... USENIX Security Symposium 6, 2021 | 1361 | 2021 |
PaLM 2 Technical Report R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ... arXiv preprint arXiv:2305.10403, 2023 | 983 | 2023 |
Gemini: a family of highly capable multimodal models Gemini Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2023 | 816 | 2023 |
Quantifying Memorization Across Neural Language Models N Carlini, D Ippolito, M Jagielski, K Lee, F Tramer, C Zhang arXiv preprint arXiv:2202.07646, 2022 | 439 | 2022 |
Deduplicating training data makes language models better K Lee, D Ippolito, A Nystrom, C Zhang, D Eck, C Callison-Burch, N Carlini arXiv preprint arXiv:2107.06499, 2021 | 393 | 2021 |
WT5?! Training Text-to-Text Models to Explain their Predictions S Narang, C Raffel, K Lee, A Roberts, N Fiedel, K Malkan arXiv preprint arXiv:2004.14546, 2020 | 174 | 2020 |
Gemma: Open Models Based on Gemini Research and Technology Gemma Team, T Mesnard, C Hardin, R Dadashi, S Bhupatiraju, S Pathak, ... arXiv preprint arXiv:2403.08295, 2024 | 171 | 2024 |
Are aligned neural networks adversarially aligned? N Carlini, M Nasr, CA Choquette-Choo, M Jagielski, I Gao, PWW Koh, ... Advances in Neural Information Processing Systems 36, 2024 | 143 | 2024 |
What Does it Mean for a Language Model to Preserve Privacy? H Brown, K Lee, F Mireshghallah, R Shokri, F Tramèr 2022 ACM Conference on Fairness, Accountability, and Transparency, 2280-2292, 2022 | 138 | 2022 |
Hallucinations in neural machine translation K Lee, O Firat, A Agarwal, C Fannjiang, D Sussillo | 122 | 2018 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ... arXiv preprint arXiv:2403.05530, 2024 | 108 | 2024 |
Scalable Extraction of Training Data from (Production) Language Models M Nasr, N Carlini, J Hayase, M Jagielski, AF Cooper, D Ippolito, ... arXiv preprint arXiv:2311.17035, 2023 | 107 | 2023 |
Counterfactual memorization in neural language models C Zhang, D Ippolito, K Lee, M Jagielski, F Tramèr, N Carlini Advances in Neural Information Processing Systems 36, 39321-39362, 2023 | 89 | 2023 |
Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy D Ippolito, F Tramèr, M Nasr, C Zhang, M Jagielski, K Lee, ... arXiv preprint arXiv:2210.17546, 2022 | 81 | 2022 |
Propagation of information along the cortical hierarchy as a function of attention while reading and listening to stories M Regev, E Simony, K Lee, KM Tan, J Chen, U Hasson Cerebral Cortex 29 (10), 4017-4034, 2019 | 72 | 2019 |
Measuring Forgetting of Memorized Training Examples M Jagielski, O Thakkar, F Tramèr, D Ippolito, K Lee, N Carlini, E Wallace, ... arXiv preprint arXiv:2207.00099, 2022 | 65 | 2022 |
A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity S Longpre, G Yauney, E Reif, K Lee, A Roberts, B Zoph, D Zhou, J Wei, ... arXiv preprint arXiv:2305.13169, 2023 | 54 | 2023 |
MADLAD-400: A multilingual and document-level large audited dataset S Kudugunta, I Caswell, B Zhang, X Garcia, D Xin, A Kusupati, R Stella, ... Advances in Neural Information Processing Systems 36, 2024 | 31 | 2024 |