Dataset of stopwords extracted from Uzbek texts

Developing named entity recognition algorithms for Uzbek: Dataset Insights and Implementation

D Mengliev, V Barakhnin, N Abdurakhmonova… - Data in Brief, 2024 - Elsevier

This paper presents a dataset and approaches to named entity recognition (NER) in the Uzbek
language, in a resource-constrained language environment. Despite the increase in NLP …

Cited by 34 · Related articles · All 9 versions

Uzbek text's correspondence with the educational potential of pupils: a case study of the School corpus

K Madatov, S Matlatipov, M Aripov - arXiv preprint arXiv:2303.00465, 2023 - arxiv.org

One of the major challenges of an educational system is choosing appropriate content
considering pupils' age and intellectual potential. In this article the experiment of primary …

Cited by 9 · Related articles · All 2 versions

Uzbek text summarization based on TF-IDF

K Madatov, S Bekchanov, J Vičič - arXiv preprint arXiv:2303.00461, 2023 - arxiv.org

The volume of information is increasing at an incredible rate with the rapid development of
the Internet and electronic information services. Due to time constraints, we don't have the …

Cited by 10 · Related articles · All 3 versions

Automatic detection of stop words for texts in the Uzbek language

K Madatov, S Bekchanov, J Vičič - 2022 - preprints.org

Stop words are very important for information retrieval and text analysis investigation. This
study aimed to automatically analyze and detect stop words in texts in the Uzbek language …

Cited by 16 · Related articles · All 10 versions

Dataset of Karakalpak language stop words

K Madatov, S Bekchanov, J Vičič - Data in Brief, 2023 - Elsevier

The dataset presented in this paper aims to address the challenge of automatic extraction of
stop words in Natural Language Processing (NLP) for the low-resource Karakalpak …

Cited by 5 · Related articles · All 8 versions

Building a Comprehensive Uzbek Lexicon: Bridging Dialects for Text Standardization

DB Mengliev, NZ Abdurakhmonova… - 2024 IEEE 25th …, 2024 - ieeexplore.ieee.org

As part of the study, the authors developed a dictionary of the formal Uzbek language and its
dialects, which can be used in the tasks of standardizing mixed texts in various dialects of …

Automating the Extraction of Words and Topics in Indonesian Using the Term Frequency-Inverse Document Frequency Algorithm and Latent Dirichlet Allocation

L Mutawalli, MTA Zaen, MF Zulkarnaen - JISA (Jurnal Informatika dan …, 2024 - trilogi.ac.id

Keyword extraction and topic modeling in the analysis of Gojek user reviews in Indonesian
are very important. By understanding user preferences and needs through keyword …