Assembling the Kazakh language corpus

A Mukhamadiyev, M Mukhiddinov, I Khujayarov… - Sensors, 2023 - mdpi.com

Automatic speech recognition systems with a large vocabulary and other natural language
processing applications cannot operate without a language model. Most studies on pre …

Cited by 20 · Related articles · All 12 versions

[PDF] arxiv.org

Creating a morphological and syntactic tagged corpus for the Uzbek language

M Sharipov, J Mattiev, J Sobirov, R Baltayev - arXiv preprint arXiv …, 2022 - arxiv.org

Nowadays, creation of the tagged corpora is becoming one of the most important tasks of
Natural Language Processing (NLP). There are not enough tagged corpora to build …

Cited by 29 · Related articles · All 3 versions

[PDF] arxiv.org

A crowdsourced open-source Kazakh speech corpus and initial speech recognition baseline

Y Khassanov, S Mussakhojayeva… - arXiv preprint arXiv …, 2020 - arxiv.org

We present an open-source speech corpus for the Kazakh language. The Kazakh speech
corpus (KSC) contains around 332 hours of transcribed audio comprising over 153,000 …

Cited by 36 · Related articles · All 7 versions

[PDF] isca-archive.org

KSC2: An Industrial-Scale Open-Source Kazakh Speech Corpus

S Mussakhojayeva, Y Khassanov, HA Varol - INTERSPEECH, 2022 - isca-archive.org

We present the first industrial-scale open-source Kazakh speech corpus for automatic
speech recognition research and development. Our corpus subsumes two previously …

Cited by 20 · Related articles · All 4 versions

[PDF] arxiv.org

KazakhTTS: An open-source Kazakh text-to-speech synthesis dataset

S Mussakhojayeva, A Janaliyeva… - arXiv preprint arXiv …, 2021 - arxiv.org

This paper introduces a high-quality open-source speech synthesis dataset for Kazakh, a
low-resource language spoken by over 13 million people worldwide. The dataset consists of …

Cited by 21 · Related articles · All 6 versions

[PDF] mdpi.com

Classification of scientific documents in the Kazakh language using deep neural networks and a fusion of images and text

A Bogdanchikov, D Ayazbayev, I Varlamis - Big Data and Cognitive …, 2022 - mdpi.com

The rapid development of natural language processing and deep learning techniques has
boosted the performance of related algorithms in several linguistic and text mining tasks …

Cited by 7 · Related articles · All 3 versions

[PDF] scielo.org.mx

Semantic hyper-graph based representation of nouns in the Kazakh language

B Yergesh, A Mukanova, A Sharipbay… - Computacion y …, 2014 - scielo.org.mx

We explain how semantic hyper-graphs are used to describe ontological models of
morphological rules of agglutinative languages, with the Kazakh language as a case study …

Cited by 40 · Related articles · All 16 versions

[PDF] ijscl.com

Developing an Online Kazakh-English-Russian Thesaurus of Industry-Specific Terminology

AT Bayekeyeva, SZ Tazhibayeva, AA Shaheen… - International Journal of …, 2022 - ijscl.com

Industry-specific translation is one of the rapidly developing and highly demanded sectors in
Kazakhstan. This paper discusses the theoretical and methodological issues of compiling a …

Cited by 10 · Related articles · All 5 versions

Metalanguage and knowledgebase for Kazakh morphology

G Yelibayeva, A Mukanova, A Sharipbay… - … Science and Its …, 2019 - Springer

Currently, the volume of various information resources in the Turkic languages is increasing.
Processing of such resources requires thesauri and corpora created using a single …

Cited by 21 · Related articles · All 3 versions

[PDF] researchgate.net

Syntactic annotation of Kazakh: Following the universal dependencies guidelines. A report

A Makazhanov, A Sultangazina… - PROCEEDINGS OF …, 2015 - researchgate.net

The present work is a report on the authors' first attempt to use the universal dependencies
(UD) (de Marneffe et al., 2014) standard for syntactic annotation of Kazakh. The report is a …

Cited by 30 · Related articles · All 10 versions