Unsupervised automatic speech recognition: A review

H Aldarmaki, A Ullah, S Ram, N Zaki - Speech Communication, 2022 - Elsevier
Automatic Speech Recognition (ASR) systems can be trained to achieve
remarkable performance given large amounts of manually transcribed speech, but large …

Unsupervised neural network based feature extraction using weak top-down constraints

H Kamper, M Elsner, A Jansen… - 2015 IEEE International …, 2015 - ieeexplore.ieee.org
Deep neural networks (DNNs) have become a standard component in supervised ASR,
used in both data-driven feature extraction and acoustic modelling. Supervision is typically …

Spoken content retrieval—beyond cascading speech recognition with text retrieval

L Lee, J Glass, H Lee, C Chan - IEEE/ACM Transactions on …, 2015 - ieeexplore.ieee.org
Spoken content retrieval refers to directly indexing and retrieving spoken content based on
the audio rather than text descriptions. This potentially eliminates the requirement of …

Fixed-dimensional acoustic embeddings of variable-length segments in low-resource settings

K Levin, K Henry, A Jansen… - 2013 IEEE workshop on …, 2013 - ieeexplore.ieee.org
Measures of acoustic similarity between words or other units are critical for segmental
exemplar-based acoustic models, spoken term discovery, and query-by-example search …

Discriminative acoustic word embeddings: Recurrent neural network-based approaches

S Settle, K Livescu - 2016 IEEE Spoken Language Technology …, 2016 - ieeexplore.ieee.org
Acoustic word embeddings, fixed-dimensional vector representations of variable-length
spoken word segments, have begun to be considered for tasks such as speech recognition …

Multilingual representations for low resource speech recognition and keyword search

J Cui, B Kingsbury, B Ramabhadran… - 2015 IEEE workshop …, 2015 - ieeexplore.ieee.org
This paper examines the impact of multilingual (ML) acoustic representations on Automatic
Speech Recognition (ASR) and keyword search (KWS) for low resource languages in the …

Unsupervised word segmentation and lexicon discovery using acoustic word embeddings

H Kamper, A Jansen… - IEEE/ACM Transactions on …, 2016 - ieeexplore.ieee.org
In settings where only unlabeled speech data is available, speech technology needs to be
developed without transcriptions, pronunciation dictionaries, or language modelling text. A …

High-performance query-by-example spoken term detection on the SWS 2013 evaluation

LJ Rodriguez-Fuentes, A Varona… - … , Speech and Signal …, 2014 - ieeexplore.ieee.org
In the last years, the task of Query-by-Example Spoken Term Detection (QbE-STD), which
aims to find occurrences of a spoken query in a set of audio documents, has gained the …

Acoustic segment modeling with spectral clustering methods

H Wang, T Lee, CC Leung, B Ma… - IEEE/ACM Transactions …, 2015 - ieeexplore.ieee.org
This paper presents a study of spectral clustering-based approaches to acoustic segment
modeling (ASM). ASM aims at finding the underlying phoneme-like speech units and …

Query-by-example spoken term detection using frequency domain linear prediction and non-segmental dynamic time warping

G Mantena, S Achanta… - IEEE/ACM Transactions on …, 2014 - ieeexplore.ieee.org
The task of query-by-example spoken term detection (QbE-STD) is to find a spoken query
within spoken audio data. Current state-of-the-art techniques assume zero prior knowledge …