Participatory research for low-resourced machine translation: A case study in African languages W Nekoto, V Marivate, T Matsila, T Fasubaa, T Kolawole, T Fagbohungbe, ... EMNLP Findings 2020, 2020 | 155 | 2020 |
Quality at a glance: An audit of web-crawled multilingual datasets J Kreutzer, I Caswell, L Wang, A Wahab, D van Esch, N Ulzii-Orshikh, ... Transactions of the Association for Computational Linguistics 10, 50-72, 2022 | 110 | 2022 |
MasakhaNER: Named entity recognition for African languages DI Adelani, J Abbott, G Neubig, D D’souza, J Kreutzer, C Lignos, ... Transactions of the Association for Computational Linguistics 9, 1116-1131, 2021 | 78 | 2021 |
Masakhane--Machine Translation For Africa I Orife, J Kreutzer, B Sibanda, D Whitenack, K Siminyu, L Martinus, JT Ali, ... arXiv preprint arXiv:2003.11529, 2020 | 60 | 2020 |
The Low-Resource Double Bind: An Empirical Study of Pruning for Low-Resource Machine Translation O Ahia, J Kreutzer, S Hooker EMNLP Findings 2021, 2021 | 43 | 2021 |
Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models O Ahia, S Kumar, H Gonen, J Kasai, DR Mortensen, NA Smith, Y Tsvetkov EMNLP 2023, 2023 | 34 | 2023 |
PidginUNMT: Unsupervised neural machine translation from West African Pidgin to English K Ogueji, O Ahia arXiv preprint arXiv:1912.03444, 2019 | 20 | 2019 |
MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition DI Adelani, G Neubig, S Ruder, S Rijhwani, M Beukman, C Palen-Michel, ... EMNLP 2022, 2022 | 19 | 2022 |
Towards supervised and unsupervised neural machine translation baselines for Nigerian Pidgin O Ahia, K Ogueji arXiv preprint arXiv:2003.12660, 2020 | 12 | 2020 |
Better Quality Pre-training Data and T5 Models for African Languages A Oladipo, M Adeyemi, O Ahia, A Owodunni, O Ogundepo, D Adelani, ... Proceedings of the 2023 Conference on Empirical Methods in Natural Language …, 2023 | 8 | 2023 |
Cross-lingual Open-Retrieval Question Answering for African Languages O Ogundepo, T Gwadabe, C Rivera, JH Clark, S Ruder, D Adelani, ... Findings of the Association for Computational Linguistics: EMNLP 2023, 14957 …, 2023 | 6* | 2023 |
Intriguing Properties of Compression on Multilingual Models K Ogueji, O Ahia, G Onilude, S Gehrmann, S Hooker, J Kreutzer EMNLP 2022, 2022 | 6 | 2022 |
What a Creole Wants, What a Creole Needs H Lent, K Ogueji, M de Lhoneux, O Ahia, A Søgaard LREC 2022, 2022 | 6 | 2022 |
DIALECTBENCH: A NLP Benchmark for Dialects, Varieties, and Closely-Related Languages F Faisal, O Ahia, A Srivastava, K Ahuja, D Chiang, Y Tsvetkov, ... ACL 2024, 2024 | 4 | 2024 |
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling T Limisiewicz, T Blevins, H Gonen, O Ahia, L Zettlemoyer ACL 2024, 2024 | 3 | 2024 |
Extracting Lexical Features from Dialects via Interpretable Dialect Classifiers R Xie, O Ahia, Y Tsvetkov, A Anastasopoulos NAACL 2024, 2024 | 1 | 2024 |
LEXPLAIN: Improving Model Explanations via Lexicon Supervision O Ahia, H Gonen, V Balachandran, Y Tsvetkov, NA Smith Proceedings of the 12th Joint Conference on Lexical and Computational …, 2023 | 1 | 2023 |
AfriWOZ: Corpus for Exploiting Cross-Lingual Transfer for Dialogue Generation in Low-Resource, African Languages T Adewumi, M Adeyemi, A Anuoluwapo, B Peters, H Buzaaba, O Samuel, ... 2023 International Joint Conference on Neural Networks (IJCNN), 1-8, 2023 | 1 | 2023 |
MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization O Ahia, S Kumar, H Gonen, V Hoffman, T Limisiewicz, Y Tsvetkov, ... arXiv preprint arXiv:2407.08818, 2024 | | 2024 |
Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects O Ahia, A Aremu, D Abagyan, H Gonen, DI Adelani, D Abolade, NA Smith, ... arXiv preprint arXiv:2406.19564, 2024 | | 2024 |