A comprehensive survey on word representation models: From classical to state-of-the-art word representation language models

U Naseem, I Razzak, SK Khan, M Prasad - Transactions on Asian and …, 2021 - dl.acm.org
Word representation has always been an important research area in the history of natural
language processing (NLP). Understanding such complex text data is imperative, given that …

What to do about bad language on the internet

J Eisenstein - Proceedings of the 2013 conference of the North …, 2013 - aclanthology.org
The rise of social media has brought computational linguistics in ever-closer contact with
bad language: text that defies our expectations about vocabulary, spelling, and syntax. This …

Argumentation mining in user-generated web discourse

I Habernal, I Gurevych - Computational linguistics, 2017 - direct.mit.edu
The goal of argumentation mining, an evolving research field in computational linguistics, is
to design methods capable of analyzing people's argumentation. In this article, we go …

[Book] Natural language processing for social media

A Farzindar, D Inkpen, G Hirst - 2015 - Springer
In recent years, online social networking has revolutionized interpersonal communication.
The newer research on language analysis in social media has been increasingly focusing …

How noisy social media text, how diffrnt social media sources?

T Baldwin, P Cook, M Lui, A MacKinlay… - Proceedings of the sixth …, 2013 - aclanthology.org
While various claims have been made about text in social media text being noisy, there has
never been a systematic study to investigate just how linguistically noisy or otherwise it is …

A dependency parser for tweets

L Kong, N Schneider, S Swayamdipta… - Proceedings of the …, 2014 - hub.hku.hk
We describe a new dependency parser
for English tweets, TWEEBOPARSER. The parser builds on several contributions: new …

What to do about non-standard (or non-canonical) language in NLP

B Plank - arXiv preprint arXiv:1608.07836, 2016 - arxiv.org
Real world data differs radically from the benchmark corpora we use in natural language
processing (NLP). As soon as we apply our technologies to the real world, performance …

Building the essential resources for Finnish: the Turku Dependency Treebank

K Haverinen, J Nyblom, T Viljanen, V Laippala… - Language Resources …, 2014 - Springer
In this paper, we present the final version of a publicly available treebank of Finnish, the
Turku Dependency Treebank. The treebank contains 204,399 tokens (15,126 sentences) …

Systematic patterning in phonologically‐motivated orthographic variation

J Eisenstein - Journal of Sociolinguistics, 2015 - Wiley Online Library
Social media features a wide range of non‐standard spellings, many of which appear
inspired by phonological variation. However, the nature of the connection between variation …

Learning part-of-speech taggers with inter-annotator agreement loss

B Plank, D Hovy, A Sogaard - Proceedings of EACL, 2014 - iris.unibocconi.it
In natural language processing (NLP) annotation projects, we use inter-annotator
agreement measures and annotation guidelines to ensure consistent annotations. However …