Low-resource keyword search strategies for Tamil
We propose strategies for a state-of-the-art keyword search (KWS) system developed by the
SINGA team in the context of the 2014 NIST Open Keyword Search Evaluation …
Unsupervised data selection and word-morph mixed language model for Tamil low-resource keyword search
This paper considers an unsupervised data selection problem for the training data of an
acoustic model and the vocabulary coverage of a keyword search system in low-resource …
Comparing decoding strategies for subword-based keyword spotting in low-resourced languages
For languages with limited training resources, out-of-vocabulary (OOV) words are a
significant problem, both for transcription and keyword spotting. This paper investigates the …
Conversational telephone speech recognition for Lithuanian
The research presented in the paper addresses conversational telephone speech
recognition and keyword spotting for the Lithuanian language. Lithuanian can be …
Hybrid sub-word segmentation for handling long tail in morphologically rich low resource languages
Dealing with Out Of Vocabulary (OOV) words or unseen words is one of the main issues of
Machine Translation (MT) as well as automatic speech recognition (ASR) systems. For …
Using pronunciation-based morphological subword units to improve OOV handling in keyword search
Out-of-vocabulary (OOV) keywords present a challenge for keyword search (KWS) systems
especially in the low-resource setting. Previous research has centered around approaches …
A comparison of methods for OOV-word recognition on a new public dataset
A common problem for automatic speech recognition systems is how to recognize words that
they did not see during training. Currently there is no established method of evaluating …
An open vocabulary OCR system with hybrid word-subword language models
The accuracy of a typical state-of-the-art optical character recognition (OCR) system benefits
greatly from using a language model (LM). However, a conventional LM has a limited …
Exponential language modeling using morphological features and multi-task learning
For languages with fast vocabulary growth and limited resources, data sparsity leads to
challenges in training a language model. One strategy for addressing this problem is to …
Comparison of Multiple System Combination Techniques for Keyword Spotting
System combination is a common approach to improving results for both speech
transcription and keyword spotting—especially in the context of low-resourced languages …