Contemporary approaches in evolving language models
This article provides a comprehensive survey of contemporary language modeling
approaches within the realm of natural language processing (NLP) tasks. This paper …
Advancing neural language modeling in automatic speech recognition.
K Irie - 2020 - publications.rwth-aachen.de
Statistical language modeling is one of the fundamental problems in natural language
processing. In recent years, language modeling has seen great advances by active …
N-gram Based Croatian Language Network: Application in a Smart Environment
In the field of natural language processing, language networks represent a method
for observing linguistic units and their interactions in different linguistic contexts. This paper …
Augmented-syllabification of n-gram tagger for Indonesian words and named-entities
As one of the statistical-based models, an n-gram syllabification commonly gives a high
syllable error rate (SER) for Bahasa Indonesia, one of the low-resource languages, since it …
Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation
Sampling-based decoding strategies have been widely adopted for Large Language
Models (LLMs) in numerous applications, which target a balance between diversity and …
Sleep Model: A Sequence Model for Predicting the Next Sleep Stage
As sleep disorders are becoming more prevalent there is an urgent need to classify sleep
stages in a less disturbing way. In particular, sleep-stage classification using simple sensors …
Data augmentation methods for low-resource orthographic syllabification
An n-gram syllabification model generally produces a high error rate for a low-resource
language, such as Indonesian, because of the high rate of out-of-vocabulary (OOV) n-grams …
Improving low compute language modeling with in-domain embedding initialisation
Many NLP applications, such as biomedical data and technical support, have 10-100 million
tokens of in-domain data and limited computational resources for learning from it. How …
Indonesian Graphemic Syllabification Using n-Gram Tagger with State-Elimination
RN Ismail, S Suyanto - 2020 8th International Conference on …, 2020 - ieeexplore.ieee.org
Syllabification can be approached using either grapheme or phoneme-based. Graphemic
syllabification is simpler than phonemic syllabification since it does not require grapheme-to …
Efficient MDI adaptation for n-gram language models
This paper presents an efficient algorithm for n-gram language model adaptation under the
minimum discrimination information (MDI) principle, where an out-of-domain language …