The tagged Icelandic corpus (MÍM)
S Helgadóttir, Á Svavarsdóttir… - Proceedings of the …, 2012 - academia.edu
In this paper, we describe the development of a morphosyntactically tagged corpus of
Icelandic, the MÍM corpus. The corpus consists of about 25 million tokens of contemporary …
Almannaromur: An open Icelandic speech corpus
J Guðnason, O Kjartansson, J Jóhannsson… - … for Under-Resourced …, 2012 - isca-archive.org
The purpose of the Almannarómur project is collecting data for a speech corpus (database)
for Icelandic. Its main aim is creating an open source speech project to enable research and …
Dealing with ambiguity in NLP: finding the best tree in the parse forest
RB Baldursson - 2023 - skemman.is
Context-free grammars (CFGs) are not typically used to parse natural languages, whereas
they are commonly used to parse programming languages. In the latter case, the CFG …
Lexicon Acquisition through Noun Clustering
AB Nikulásdóttir, M Whelpton - LexicoNordica, 2010 - tidsskrift.dk
This paper describes an experiment with clustering of Icelandic nouns based on semantic
relatedness. This work is part of a larger project aiming at semi-automatically constructing a …
From human-oriented dictionaries to computer-oriented lexical resources - trying to pin down words
M Whelpton - Orð og tunga, 2012 - ordogtunga.arnastofnun.is
Dictionaries are designed for the human user; electronic lexical resources are often
designed with computers in mind: to represent information about the form, use and meaning …
Icelandic language technology: an overview
E Rögnvaldsson - Language, Languages and New Technologies: ICT …, 2010 - efnil.nytud.hu
We describe the establishment and development of Icelandic language technology since its
very beginning ten years ago. The ground was laid with a report from an Expert Group …
Samba: Automatic identification of verbal expressions in Icelandic
K Rúnarsson - 2017 - skemman.is
This thesis discusses the development of Samba, a software solution designed to identify
known verbal expressions in PoS-tagged and lemmatized text. Samba uses a database of …