MultiLexNorm: A shared task on multilingual lexical normalization
Lexical normalization is the task of transforming an utterance into its standardized form. This
task is beneficial for downstream analysis, as it provides a way to harmonize (often …
Graph-based Turkish text normalization and its impact on noisy text processing
User generated texts on the web are freely-available and lucrative sources of data for
language technology researchers. Unfortunately, these texts are often dominated by …
MoNoise: A multi-lingual and easy-to-use lexical normalization tool
R Van Der Goot - Proceedings of the 57th Annual Meeting of the …, 2019 - aclanthology.org
In this paper, we introduce and demonstrate the online demo as well as the command line
interface of a lexical normalization system (MoNoise) for a variety of languages. We further …
Do word embeddings capture spelling variation?
Analyses of word embeddings have primarily focused on semantic and syntactic properties.
However, word embeddings have the potential to encode other properties as well. In this …
Lexical normalization using generative transformer model (LN-GTM)
M Ashmawy, MW Fakhr, FA Maghraby - International Journal of …, 2023 - Springer
Lexical Normalization (LN) aims to normalize a nonstandard text to a standard text. This
problem is of extreme importance in natural language processing (NLP) when applying …
Synthetic Data for English Lexical Normalization: How Close Can We Get to Manually Annotated Data?
K Dekker, R Van Der Goot - Proceedings of the Twelfth Language …, 2020 - aclanthology.org
Social media is a valuable data resource for various natural language processing (NLP)
tasks. However, standard NLP tools were often designed with standard texts in mind, and …
ViLexNorm: A Lexical Normalization Corpus for Vietnamese Social Media Text
Lexical normalization, a fundamental task in Natural Language Processing (NLP), involves
the transformation of words into their canonical forms. This process has been proven to …
A survey of recent error annotation schemes for automatically generated text
While automatically computing numerical scores remains the dominant paradigm in NLP
system evaluation, error analysis is receiving increasing attention, with numerous error …
hinglishNorm--A Corpus of Hindi-English Code Mixed Sentences for Text Normalization
We present hinglishNorm--a human annotated corpus of Hindi-English code-mixed
sentences for text normalization task. Each sentence in the corpus is aligned to its …
Normalization and parsing algorithms for uncertain input
RM van der Goot - 2019 - research.rug.nl
The automatic analysis (parsing) of natural language is an important ingredient for many
natural language processing applications (search-engines, automatic translation, speech …