Deep learning-based morphological taggers and lemmatizers for annotating historical texts

H Schmid - Proceedings of the 3rd international conference on …, 2019 - dl.acm.org
Part-of-speech tagging, morphological tagging, and lemmatization of historical texts pose
special challenges due to the high spelling variability and the lack of large, high-quality …

[HTML] Text mining the history of medicine

P Thompson, RT Batista-Navarro, G Kontonatsios… - PloS one, 2016 - journals.plos.org
Historical text archives constitute a rich and diverse source of information, which is
becoming increasingly readily accessible, due to large-scale digitisation efforts. However, it …

[BOOK] Historische formelhafte Sprache: theoretische Grundlagen und methodische Herausforderungen

N Filatkina - 2018 - books.google.com
Languages function through the interplay of the usual, the formulaic, and
variation, which paradoxically is at once an indicator of fixedness and its driving …

[PDF] A gold standard corpus of Early Modern German

S Scheible, RJ Whitt, M Durrell… - Proceedings of the 5th …, 2011 - aclanthology.org
This paper describes an annotated gold standard sample corpus of Early Modern German
containing over 50,000 tokens of text manually annotated with POS tags, lemmas, and …

[PDF] Extending the tool, or how to annotate historical language varieties

C Sánchez-Marco, G Boleda… - Proceedings of the 5th ACL …, 2011 - aclanthology.org
We present a general and simple method to adapt an existing NLP tool in order to enable it
to deal with historical varieties of languages. This approach consists basically in expanding …

[PDF] Evaluating an 'off-the-shelf' POS-tagger on Early Modern German text

S Scheible, RJ Whitt, M Durrell… - Proceedings of the 5th …, 2011 - aclanthology.org
The goal of this study is to evaluate an 'off-the-shelf' POS-tagger for modern German on
historical data from the Early Modern period (1650-1800). With no specialised tagger …

[PDF] Using comparable collections of historical texts for building a diachronic dictionary for spelling normalization

M Amoia, JM Martinez - Proceedings of the 7th workshop on …, 2013 - aclanthology.org
In this paper, we argue that comparable collections of historical written resources can help
overcoming typical challenges posed by heritage texts enhancing spelling normalization …

How to tag non-standard language: Normalisation versus domain adaptation for Slovene historical and user-generated texts

K Zupan, N Ljubešić, T Erjavec - Natural Language Engineering, 2019 - cambridge.org
Part-of-speech (PoS) tagging of non-standard language with models developed for standard
language is known to suffer from a significant decrease in accuracy. Two methods are …

Part-of-speech in historical corpora: Tagger evaluation and ensemble systems on ARCHER

G Schneider, M Hundt, R Oppliger - 2016 - zora.uzh.ch
Tagger accuracy deteriorates when applied to texts different from the training corpus, e.g. with
respect to register or time period. On historical data, accuracy can drop to and below 90 …

Tracing the development of Spanish participial constructions: an empirical study of semantic change

C Sánchez Marco - 2012 - tdx.cat
The main aim of this thesis is to trace the development of four different constructions
involving auxiliaries and participles through the history of the Spanish language. These …