Open-source tools for morphology, lemmatization, POS tagging and named entity recognition

J Straková, M Straka, J Hajic - … of 52nd annual meeting of the …, 2014 - aclanthology.org
Proceedings of 52nd annual meeting of the Association for …, 2014
Abstract
We present two recently released open-source taggers. NameTag is free software for named entity recognition (NER) which achieves state-of-the-art performance on Czech. MorphoDiTa (Morphological Dictionary and Tagger) performs morphological analysis (with lemmatization), morphological generation, tagging and tokenization, with state-of-the-art results for Czech and a throughput of around 10-200K words per second. The taggers can be trained for any language for which annotated data exist, but they are specifically designed to be efficient for inflective languages. Both tools are free software under the LGPL license and are distributed along with trained linguistic models, which are free for non-commercial use under the CC BY-NC-SA license. The releases include standalone tools, C++ libraries with Java, Python and Perl bindings, and web services.