Approaches to corpus creation for low-resource language technology: the case of Southern Kurdish and Laki
One of the major challenges that under-represented and endangered language
communities face in language technology is the lack or paucity of language data. This is …
CODET: A benchmark for contrastive dialectal evaluation of machine translation
Neural machine translation (NMT) systems exhibit limited robustness in handling source-
side linguistic variations. Their performance tends to degrade when faced with even slight …
Towards machine translation for the Kurdish language
S Ahmadi, M Masoud - arXiv preprint arXiv:2010.06041, 2020 - arxiv.org
Machine translation is the task of translating texts from one language to another using
computers. It has been one of the major tasks in natural language processing and …
Research on the Application of Translation Parallel Corpus in Interpretation Teaching
Y Gong, L Cheng - ACM Transactions on Asian and Low-Resource …, 2023 - dl.acm.org
Large and organized sets of translated texts between languages are called parallel
translation corpora (PTLs). Even though data-driven learning can generate insights from …
Language and Speech Technology for Central Kurdish Varieties
Kurdish, an Indo-European language spoken by over 30 million speakers, is considered a
dialect continuum and known for its diversity in language varieties. Previous studies …
Making Old Kurdish Publications Processable by Augmenting Available Optical Character Recognition Engines
B Yaseen, H Hassani - arXiv preprint arXiv:2404.06101, 2024 - arxiv.org
Kurdish libraries have many historical publications that were printed back in the early days
when printing devices were brought to Kurdistan. Having a good Optical Character …
Automatically temporal labeled data generation using positional lexicon expansion for focus time estimation of news articles
U Ahmed, JCW Lin, V Garcia Diaz - ACM Transactions on Asian and …, 2024 - dl.acm.org
Many facts change over time, which is a fundamental aspect of our physical environment. In
the case of pandemic articles, the user is not interested in the creation date of the document …
WH2D2N2: Distributed AI-enabled OK-ASN Service for Web of Things
Model data-driven ontology and knowledge presentation for evolving semantic Asian social
networks (OK-ASN) is a critical strategy for web of things (WoT) services. Meanwhile, Deep …
Mizo to English machine translation: An evaluation benchmark
Speech is the most natural method for people to convey emotions and communicate.
Traditional input techniques are used for machine communication. Communication across …
Central Kurdish machine translation: First large scale parallel corpus and experiments
Z Amini, M Mohammadamini, H Hosseini… - arXiv preprint arXiv …, 2021 - arxiv.org
While the computational processing of Kurdish has experienced a relative increase, the
machine translation of this language seems to be lacking a considerable body of scientific …