MultiLexNorm: A shared task on multilingual lexical normalization

R Van Der Goot, A Ramponi, A Zubiaga… - Seventh Workshop on …, 2021 - pure.itu.dk
Lexical normalization is the task of transforming an utterance into its standardized form. This
task is beneficial for downstream analysis, as it provides a way to harmonize (often …

Graph-based Turkish text normalization and its impact on noisy text processing

S Demir, B Topcu - Engineering Science and Technology, an International …, 2022 - Elsevier
User generated texts on the web are freely-available and lucrative sources of data for
language technology researchers. Unfortunately, these texts are often dominated by …

MoNoise: A multi-lingual and easy-to-use lexical normalization tool

R Van Der Goot - Proceedings of the 57th Annual Meeting of the …, 2019 - aclanthology.org
In this paper, we introduce and demonstrate the online demo as well as the command line
interface of a lexical normalization system (MoNoise) for a variety of languages. We further …

Do word embeddings capture spelling variation?

D Nguyen, J Grieve - … of the 28th International Conference on …, 2020 - aclanthology.org
Analyses of word embeddings have primarily focused on semantic and syntactic properties.
However, word embeddings have the potential to encode other properties as well. In this …

Lexical normalization using generative transformer model (LN-GTM)

M Ashmawy, MW Fakhr, FA Maghraby - International Journal of …, 2023 - Springer
Lexical Normalization (LN) aims to normalize a nonstandard text to a standard text. This
problem is of extreme importance in natural language processing (NLP) when applying …

Synthetic Data for English Lexical Normalization: How Close Can We Get to Manually Annotated Data?

K Dekker, R Van Der Goot - Proceedings of the Twelfth Language …, 2020 - aclanthology.org
Social media is a valuable data resource for various natural language processing (NLP)
tasks. However, standard NLP tools were often designed with standard texts in mind, and …

ViLexNorm: A Lexical Normalization Corpus for Vietnamese Social Media Text

TN Nguyen, TP Le, K Van Nguyen - arXiv preprint arXiv:2401.16403, 2024 - arxiv.org
Lexical normalization, a fundamental task in Natural Language Processing (NLP), involves
the transformation of words into their canonical forms. This process has been proven to …

A survey of recent error annotation schemes for automatically generated text

R Huidrom, A Belz - Proceedings of the 2nd Workshop on Natural …, 2022 - aclanthology.org
While automatically computing numerical scores remains the dominant paradigm in NLP
system evaluation, error analysis is receiving increasing attention, with numerous error …

hinglishNorm--A Corpus of Hindi-English Code Mixed Sentences for Text Normalization

P Makhija, A Kumar, A Gupta - arXiv preprint arXiv:2010.08974, 2020 - arxiv.org
We present hinglishNorm--a human annotated corpus of Hindi-English code-mixed
sentences for text normalization task. Each sentence in the corpus is aligned to its …

Normalization and parsing algorithms for uncertain input

RM van der Goot - 2019 - research.rug.nl
The automatic analysis (parsing) of natural language is an important ingredient for many
natural language processing applications (search-engines, automatic translation, speech …