Strong baselines for neural semi-supervised learning under domain shift
Novel neural models have been proposed in recent years for learning under domain shift.
Most models, however, only evaluate on a single task, on proprietary datasets, or compare …
MultiLexNorm: A shared task on multilingual lexical normalization
Lexical normalization is the task of transforming an utterance into its standardized form. This
task is beneficial for downstream analysis, as it provides a way to harmonize (often …
Enhancing BERT for lexical normalization
Language model-based pre-trained representations have become ubiquitous in natural
language processing. They have been shown to significantly improve the performance of …
Graph-based Turkish text normalization and its impact on noisy text processing
User generated texts on the web are freely-available and lucrative sources of data for
language technology researchers. Unfortunately, these texts are often dominated by …
Dialect text normalization to normative standard Finnish
N Partanen, M Hämäläinen… - Workshop on Noisy …, 2019 - researchportal.helsinki.fi
We compare different LSTMs and transformer models in terms of their effectiveness in
normalizing dialectal Finnish into the normative standard Finnish. As dialect is the common …
MoNoise: Modeling noise using a modular normalization system
R van der Goot, G van Noord - arXiv preprint arXiv:1710.03476, 2017 - arxiv.org
We propose MoNoise: a normalization model focused on generalizability and efficiency, it
aims at being easily reusable and adaptable. Normalization is the task of translating texts …
Rule-based text normalization for Malay social media texts
SNAN Ariffin, S Tiun - International Journal of Advanced …, 2020 - pdfs.semanticscholar.org
Malay social media text is a text written on social media networks like Twitter. Commonly,
this text comprises nonstandard words, filled with dialects, foreign languages, word …
Noise-robust morphological disambiguation for dialectal Arabic
User-generated text tends to be noisy with many lexical and orthographic inconsistencies,
making natural language processing (NLP) tasks more challenging. The challenging nature …
Lexical normalization for code-switched data and its effect on POS-tagging
R Van Der Goot, Ö Çetinoğlu - arXiv preprint arXiv:2006.01175, 2020 - arxiv.org
Lexical normalization, the translation of non-canonical data to standard language, has
shown to improve the performance of many natural language processing tasks on social …
Annotating Norwegian language varieties on Twitter for part-of-speech
Norwegian Twitter data poses an interesting challenge for Natural Language Processing
(NLP) tasks. These texts are difficult for models trained on standardized text in one of the two …