UniMorph 4.0: universal morphology

K Batsuren, O Goldman, S Khalifa, N Habash… - arXiv preprint arXiv …, 2022 - arxiv.org
The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage
instantiated normalized morphological inflection tables for hundreds of diverse …

Resources for Turkish natural language processing: A critical survey

Ç Çöltekin, AS Doğruöz, Ö Çetinoğlu - Language Resources and …, 2023 - Springer
This paper presents a comprehensive survey of corpora and lexical resources available for
Turkish. We review a broad range of resources, focusing on the ones that are publicly …

Universal Derivations 1.0, A Growing Collection of Harmonised Word-Formation Resources

L Kyjánek, Z Žabokrtský, M Ševčíková… - Prague Bull. Math …, 2020 - academia.edu
The paper deals with harmonisation of existing data resources containing word-formation
features by converting them into a common file format and partially aligning their annotation …

The Design of CroDeriv 2.0

M Filko, K Šojat, V Štefanec - Prague Bull. Math. Linguistics, 2020 - ufal.mff.cuni.cz
This paper deals with methods applied in the expansion and design of CroDeriv, the
Croatian derivational lexicon. The first version of the lexicon contained only verbs that were …

Transferring Word-Formation Networks Between Languages

J Vidra, Z Žabokrtský - The Prague Bulletin of Mathematical …, 2023 - publications.cuni.cz
We present a method for supervised cross-lingual construction of word-formation networks
(WFNs). WFNs are resources capturing derivational, compositional and other relations …

Harmonisation of language resources for word-formation of multiple languages

L Kyjánek - 2020 - dspace.cuni.cz
In the field of Natural Language Processing, word-formation is under-resourced compared
to inflectional morphology. Moreover, the existing resources capturing word-formation differ …

Semi-supervised induction of morpheme boundaries in Czech using a word-formation network

J Bodnár, Z Žabokrtský, M Ševčíková - International Conference on Text …, 2020 - Springer
This paper deals with automatic morphological segmentation of Czech lemmas contained in
the word-formation network DeriNet. Capturing derivational relations between base and …

Compound Splitting and Analysis for Russian

D Vodolazsky, H Petrov - Resources and Tools for Derivational …, 2021 - nabil.hathout.free.fr
First, we classify whether a word is a compound. We solve this task as a binary classification
problem with a character-level bidirectional LSTM with attention. More formally, given a …

Next Step in Online Querying and Visualization of Word-Formation Networks

J Vidra, Z Žabokrtský - Text, Speech, and Dialogue: 23rd International …, 2020 - Springer
In this paper, we introduce a new and improved version of DeriSearch, a search engine and
visualizer for word-formation networks. Word-formation networks are datasets that express …

Morphological Networks for Persian and Turkish: What Can Be Induced from Morpheme Segmentation?

H Haghdoost, E Ansari, Z Žabokrtský… - Prague Bull. Math …, 2020 - ufal.mff.cuni.cz
In this work, we propose an algorithm that induces morphological networks for Persian and
Turkish. The algorithm uses morpheme-segmented lexicons for the two languages. The …