A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias

Y Xu, L Hu, J Zhao, Z Qiu, Y Ye, H Gu - arXiv preprint arXiv:2404.00929, 2024 - arxiv.org
Based on the foundation of Large Language Models (LLMs), Multilingual Large Language
Models (MLLMs) have been developed to address the challenges of multilingual natural …

CIRAL: A Test Collection for CLIR Evaluations in African Languages

M Adeyemi, A Oladipo, X Zhang… - Proceedings of the 47th …, 2024 - dl.acm.org
Cross-lingual information retrieval (CLIR) continues to be an actively studied topic in
information retrieval (IR), and there have been consistent efforts in curating test collections to …

Somali Information Retrieval Corpus: Bridging the Gap between Query Translation and Dedicated Language Resources

A Badel, T Zhong, W Tai, F Zhou - Proceedings of the 2023 …, 2023 - aclanthology.org
Despite the growing use of the Somali language in various online domains, research on
Somali language information retrieval remains limited and primarily relies on query …

Cross-Lingual Information Retrieval in a Hybrid Query Model for Optimality

A Basit, I Hanif, MS Maqbool, W Qayyum… - Journal of Computing & …, 2023 - jcbi.org
Cross-Lingual Information Retrieval (CLIR) allows users to retrieve documents in a
language other than the query language. It is accomplished in two ways: in the first method the …

Cross-Lingual Information Retrieval from Multilingual Construction Documents Using Pretrained Language Models

J Kim, S Chung, S Chi - Journal of Construction Engineering and …, 2024 - ascelibrary.org
The growth of the global construction market has attracted international companies to
participate in overseas projects. Overseas projects are extremely dynamic with numerous …

CIRAL at FIRE 2023: Cross-Lingual Information Retrieval for African Languages

M Adeyemi, A Oladipo, X Zhang… - Proceedings of the 15th …, 2023 - dl.acm.org
This paper provides a short overview of the CIRAL track at the Forum for Information
Retrieval Evaluation (FIRE) 2023. CIRAL focused on cross-lingual information retrieval …

On Backbones and Training Regimes for Dense Retrieval in African Languages

A Oladipo, M Adeyemi, J Lin - Proceedings of the 47th International ACM …, 2024 - dl.acm.org
The effectiveness of dense retrieval models trained with multilingual language models as
backbones has been demonstrated in multilingual and cross-lingual information retrieval …

State of NLP in Kenya: A Survey

CJ Amol, EA Chimoto, RD Gesicho, AM Gitau… - arXiv preprint arXiv …, 2024 - arxiv.org
Kenya, known for its linguistic diversity, faces unique challenges and promising
opportunities in advancing Natural Language Processing (NLP) technologies, particularly …

What are the limits of cross-lingual dense passage retrieval for low-resource languages?

J Wu, Z Ren, S Verberne - arXiv preprint arXiv:2408.11942, 2024 - arxiv.org
In this paper, we analyze the capabilities of the multi-lingual Dense Passage Retriever
(mDPR) for extremely low-resource languages. In the Cross-lingual Open-Retrieval Answer …

Multilinguality in Misinformation Detection

A Ekbal, R Kumari - Dive into Misinformation Detection: From Unimodal to …, 2024 - Springer
This chapter examines the value of multilingualism and how it can be used to address
different NLP issues. It first presents a multilingual multimodal misinformation dataset and …