BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1294 | 2023 |
Datasets: A community library for … Q Lhoest, AV del Moral, Y Jernite, A Thakur, P von Platen, S Patil, ... Proceedings of the 2021 Conference on Empirical Methods in Natural Language …, 2021 | 215 | 2021 |
Datasets: A community library for natural language processing Q Lhoest, AV del Moral, Y Jernite, A Thakur, P von Platen, S Patil, ... arXiv preprint arXiv:2109.02846, 2021 | 199 | 2021 |
The BigScience ROOTS corpus: A 1.6 TB composite multilingual dataset H Laurençon, L Saulnier, T Wang, C Akiki, AV del Moral, ... Advances in Neural Information Processing Systems 35, 31809-31826, 2022 | 123 | 2022 |
Q Lhoest, AV del Moral, Y Jernite, A Thakur, P von Platen, S Patil, ... Lysandre Debut, Stas Bekman, Pierric Cistac, Thibault Goehringer, Victor Mustar, François Lagunas, Alexander M Rush, and Thomas Wolf, 2021 | 16 | 2021 |
Evaluate & Evaluation on the Hub: Better best practices for data and model measurements L von Werra, L Tunstall, A Thakur, S Luccioni, T Thrush, A Piktus, F Marty, ... Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022 | 14 | 2022 |
Loubna Ben Allal H Laurençon, L Saulnier, T Wang, C Akiki, AV del Moral, T Le Scao, ... | 12 | 2022 |