POS tagging for historical texts with sparse training data

CP Chai - Natural Language Engineering, 2023 - cambridge.org

Text preprocessing is not only an essential step to prepare the corpus for modeling but also
a key area that directly affects the natural language processing (NLP) application results. For …

Cited by 92 - Related articles - All 4 versions

Deep learning-based morphological taggers and lemmatizers for annotating historical texts

H Schmid - Proceedings of the 3rd international conference on …, 2019 - dl.acm.org

Part-of-speech tagging, morphological tagging, and lemmatization of historical texts pose
special challenges due to the high spelling variability and the lack of large, high-quality …

Cited by 63 - Related articles - All 3 versions

Text mining the history of medicine

P Thompson, RT Batista-Navarro, G Kontonatsios… - PloS one, 2016 - journals.plos.org

Historical text archives constitute a rich and diverse source of information, which is
becoming increasingly readily accessible, due to large-scale digitisation efforts. However, it …

Cited by 80 - Related articles - All 19 versions

An evaluation of neural machine translation models on historical spelling normalization

G Tang, F Cap, E Pettersson, J Nivre - arXiv preprint arXiv:1806.05210, 2018 - arxiv.org

In this paper, we apply different NMT models to the problem of historical spelling
normalization for five languages: English, German, Hungarian, Icelandic, and Swedish. The …

Cited by 53 - Related articles - All 6 versions

A multilingual evaluation of three spelling normalisation methods for historical text

E Pettersson, B Megyesi, J Nivre - Proceedings of the 8th …, 2014 - aclanthology.org

We present a multilingual evaluation of approaches for spelling normalisation of historical
text based on data from five languages: English, German, Hungarian, Icelandic, and …

Cited by 56 - Related articles - All 6 versions

Spelling normalisation and linguistic analysis of historical text for information extraction

E Pettersson - 2016 - diva-portal.org

Abstract Pettersson, E. 2016. Spelling Normalisation and Linguistic Analysis of Historical
Text for Information Extraction. Studia Linguistica Upsaliensia 17. 147 pp. Uppsala: Acta …

Cited by 45 - Related articles

Normalization of historical texts with neural network models

M Bollmann - 2018 - hss-opus.ub.ruhr-uni-bochum.de

With the increasing availability of digitized resources of historical documents, interest in
effective natural language processing (NLP) for these documents is on the rise. However …

Cited by 31 - Related articles - All 7 versions

To normalize, or not to normalize: The impact of normalization on part-of-speech tagging

R Van der Goot, B Plank, M Nissim - arXiv preprint arXiv:1707.05116, 2017 - arxiv.org

Does normalization help Part-of-Speech (POS) tagging accuracy on noisy, non-canonical
data? To the best of our knowledge, little is known on the actual impact of normalization in a …

Cited by 30 - Related articles - All 11 versions

Applying rule-based normalization to different types of historical texts—an evaluation

M Bollmann, F Petran, S Dipper - … 2011, Poznań, Poland, November 25–27 …, 2014 - Springer

This paper deals with normalization of language data from Early New High German. We
describe an unsupervised, rule-based approach which maps historical wordforms to modern …

Cited by 22 - Related articles - All 11 versions

How to tag non-standard language: Normalisation versus domain adaptation for Slovene historical and user-generated texts

K Zupan, N Ljubešić, T Erjavec - Natural Language Engineering, 2019 - cambridge.org

Part-of-speech (PoS) tagging of non-standard language with models developed for standard
language is known to suffer from a significant decrease in accuracy. Two methods are …

Cited by 10 - Related articles - All 3 versions