Approaches to corpus creation for low-resource language technology: the case of Southern Kurdish and Laki

S Ahmadi, Z Azin, S Belelli… - arXiv preprint arXiv …, 2023 - arxiv.org
One of the major challenges that under-represented and endangered language
communities face in language technology is the lack or paucity of language data. This is …

CODET: A benchmark for contrastive dialectal evaluation of machine translation

MMI Alam, S Ahmadi, A Anastasopoulos - arXiv preprint arXiv:2305.17267, 2023 - arxiv.org
Neural machine translation (NMT) systems exhibit limited robustness in handling
source-side linguistic variations. Their performance tends to degrade when faced with even slight …

Towards machine translation for the Kurdish language

S Ahmadi, M Masoud - arXiv preprint arXiv:2010.06041, 2020 - arxiv.org
Machine translation is the task of translating texts from one language to another using
computers. It has been one of the major tasks in natural language processing and …

Research on the Application of Translation Parallel Corpus in Interpretation Teaching

Y Gong, L Cheng - ACM Transactions on Asian and Low-Resource …, 2023 - dl.acm.org
Large and organized sets of translated texts between languages are called parallel
translation corpora (PTLs). Even though data-driven learning can generate insights from …

Language and Speech Technology for Central Kurdish Varieties

S Ahmadi, DQ Jaff, MMI Alam… - arXiv preprint arXiv …, 2024 - arxiv.org
Kurdish, an Indo-European language spoken by over 30 million speakers, is considered a
dialect continuum and known for its diversity in language varieties. Previous studies …

Making Old Kurdish Publications Processable by Augmenting Available Optical Character Recognition Engines

B Yaseen, H Hassani - arXiv preprint arXiv:2404.06101, 2024 - arxiv.org
Kurdish libraries have many historical publications that were printed back in the early days
when printing devices were brought to Kurdistan. Having a good Optical Character …

Automatically temporal labeled data generation using positional lexicon expansion for focus time estimation of news articles

U Ahmed, JCW Lin, V Garcia Diaz - ACM Transactions on Asian and …, 2024 - dl.acm.org
Many facts change over time, which is a fundamental aspect of our physical environment. In
the case of pandemic articles, the user is not interested in the creation date of the document …

WH2D2N2: Distributed AI-enabled OK-ASN Service for Web of Things

K Liang, R Ma, Y Hua, H Wang, N Hu, T Song… - ACM Transactions on …, 2023 - dl.acm.org
Model data-driven ontology and knowledge presentation for evolving semantic Asian social
networks (OK-ASN) is a critical strategy for web of things (WoT) services. Meanwhile, Deep …

Mizo to English machine translation: An evaluation benchmark

V Hnamte, H Thangkhanhau, J Hussain… - 2022 International …, 2022 - ieeexplore.ieee.org
Speech is the most natural method for people to convey emotions and communicate.
Traditional input techniques are used for machine communication. Communication across …

Central Kurdish machine translation: First large scale parallel corpus and experiments

Z Amini, M Mohammadamini, H Hosseini… - arXiv preprint arXiv …, 2021 - arxiv.org
While the computational processing of Kurdish has experienced a relative increase, the
machine translation of this language seems to be lacking a considerable body of scientific …