Low-resource keyword search strategies for Tamil

NF Chen, C Ni, IF Chen, S Sivadas… - … , Speech and Signal …, 2015 - ieeexplore.ieee.org
We propose strategies for a state-of-the-art keyword search (KWS) system developed by the
SINGA team in the context of the 2014 NIST Open Keyword Search Evaluation …

Unsupervised data selection and word-morph mixed language model for Tamil low-resource keyword search

C Ni, CC Leung, L Wang, NF Chen… - 2015 IEEE International …, 2015 - ieeexplore.ieee.org
This paper considers an unsupervised data selection problem for the training data of an
acoustic model and the vocabulary coverage of a keyword search system in low-resource …

Comparing decoding strategies for subword-based keyword spotting in low-resourced languages

W Hartmann, VB Le, A Messaoudi, L Lamel… - … annual conference of …, 2014 - isca-archive.org
For languages with limited training resources, out-of-vocabulary (OOV) words are a
significant problem, both for transcription and keyword spotting. This paper investigates the …

Conversational telephone speech recognition for Lithuanian

R Lileikytė, L Lamel, JL Gauvain, A Gorin - Computer Speech & Language, 2018 - Elsevier
The research presented in the paper addresses conversational telephone speech
recognition and keyword spotting for the Lithuanian language. Lithuanian can be …

Hybrid sub-word segmentation for handling long tail in morphologically rich low resource languages

S Manghat, S Manghat, T Schultz - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Dealing with Out Of Vocabulary (OOV) words or unseen words is one of the main issues of
Machine Translation (MT) as well as automatic speech recognition (ASR) systems. For …

Using pronunciation-based morphological subword units to improve OOV handling in keyword search

Y He, P Baumann, H Fang… - … on Audio, Speech …, 2015 - ieeexplore.ieee.org
Out-of-vocabulary (OOV) keywords present a challenge for keyword search (KWS) systems
especially in the low-resource setting. Previous research has centered around approaches …

A comparison of methods for OOV-word recognition on a new public dataset

RA Braun, S Madikeri, P Motlicek - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
A common problem for automatic speech recognition systems is how to recognize words that
they did not see during training. Currently there is no established method of evaluating …

An open vocabulary OCR system with hybrid word-subword language models

M Cai, W Hu, K Chen, L Sun, S Liang… - 2017 14th IAPR …, 2017 - ieeexplore.ieee.org
The accuracy of a typical state-of-the-art optical character recognition (OCR) system benefits
greatly from using a language model (LM). However, a conventional LM has a limited …

Exponential language modeling using morphological features and multi-task learning

H Fang, M Ostendorf, P Baumann… - … /ACM Transactions on …, 2015 - ieeexplore.ieee.org
For languages with fast vocabulary growth and limited resources, data sparsity leads to
challenges in training a language model. One strategy for addressing this problem is to …

Comparison of Multiple System Combination Techniques for Keyword Spotting

W Hartmann, L Zhang, K Barnes, R Hsiao… - Interspeech, 2016 - researchgate.net
System combination is a common approach to improving results for both speech
transcription and keyword spotting—especially in the context of low-resourced languages …