Tagging the Bard: Evaluating the accuracy of a modern POS tagger on Early Modern English corpora

T McEnery, A Hardie - 2011 - books.google.com

Corpus linguistics is the study of language data on a large scale: the computer-aided
analysis of very extensive collections of transcribed utterances or written texts. This textbook …

Cited by 5704

Shakespearizing modern language using copy-enriched sequence-to-sequence models

H Jhamtani, V Gangal, E Hovy, E Nyberg - arXiv preprint arXiv:1707.01161, 2017 - arxiv.org

Variations in writing styles are commonly used to adapt the content to a specific context,
audience, or purpose. However, applying stylistic variations is still by and large a manual …

Cited by 219

[PDF] VARD2: A tool for dealing with spelling variation in historical corpora

A Baron, P Rayson - 2008 - researchgate.net

When applying corpus linguistic techniques to historical corpora, the corpus researcher
should be cautious about the results obtained. Corpus annotation techniques such as part of …

Cited by 258

[PDF] Word frequency and key word statistics in historical corpus linguistics

A Baron, P Rayson, D Archer - Anglistik: International Journal of …, 2009 - academia.edu

Frequency-sorted word lists have long been part of the standard methodology for exploiting
corpora. Sinclair (1991: 30) noted that "anyone studying a text is likely to need to know how …

Cited by 201

[BOOK] Contemporary corpus linguistics

P Baker - 2012 - books.google.com

Corpus linguistics uses large electronic databases of language to examine hypotheses
about language use. These can be tested scientifically with computerised analytical tools …

Cited by 170

[BOOK] Corpus linguistics for online communication: A guide for research

L Collins - 2019 - taylorfrancis.com

Corpus Linguistics for Online Communication provides an instructive and practical guide to
conducting research using methods in corpus linguistics in studies of various forms of online …

Cited by 61

Deep learning-based morphological taggers and lemmatizers for annotating historical texts

H Schmid - Proceedings of the 3rd international conference on …, 2019 - dl.acm.org

Part-of-speech tagging, morphological tagging, and lemmatization of historical texts pose
special challenges due to the high spelling variability and the lack of large, high-quality …

Cited by 57

[HTML] The electronic corpus of 17th- and 18th-century Polish texts

W Gruszczyński, D Adamiec, R Bronikowska… - Language Resources …, 2022 - Springer

The paper describes the process of building the electronic corpus of 17th- and 18th-century
Polish texts, a relatively large, balanced, structurally and morphologically annotated …

Cited by 15

Guidelines for normalising Early Modern English corpora: Decisions and justifications

D Archer, M Kytö, A Baron, P Rayson - ICAME Journal, 2015 - sciendo.com

Corpora of Early Modern English have been collected and released for research for
a number of years. With large scale digitisation activities gathering pace in the last decade …

Cited by 54

Creation of an annotated corpus of Old and Middle Hungarian court records and private correspondence

A Novák, K Gugán, M Varga, A Dömötör - Language Resources and …, 2018 - Springer

The paper introduces a novel annotated corpus of Old and Middle Hungarian (16–18
century), the texts of which were selected in order to approximate the vernacular of the given …

Cited by 41