Self-supervised speech representation learning: A review

A Mohamed, H Lee, L Borgholt… - IEEE Journal of …, 2022 - ieeexplore.ieee.org
Although supervised deep learning has revolutionized speech and audio processing, it has
necessitated the building of specialist models for individual tasks and application scenarios …

Unsupervised automatic speech recognition: A review

H Aldarmaki, A Ullah, S Ram, N Zaki - Speech Communication, 2022 - Elsevier
Automatic Speech Recognition (ASR) systems can be trained to achieve
remarkable performance given large amounts of manually transcribed speech, but large …

Effectiveness of self-supervised pre-training for speech recognition

A Baevski, M Auli, A Mohamed - arXiv preprint arXiv:1911.03912, 2019 - arxiv.org
We compare self-supervised representation learning algorithms which either explicitly
quantize the audio data or learn representations without quantization. We find the former to …

The zero resource speech challenge 2017

E Dunbar, XN Cao, J Benjumea… - 2017 IEEE Automatic …, 2017 - ieeexplore.ieee.org
We describe a new challenge aimed at discovering subword and word units from raw
speech. This challenge is the followup to the Zero Resource Speech Challenge 2015. It …

Effectiveness of self-supervised pre-training for ASR

A Baevski, A Mohamed - ICASSP 2020-2020 IEEE International …, 2020 - ieeexplore.ieee.org
We compare self-supervised representation learning algorithms which either explicitly
quantize the audio data or learn representations without quantization. We find the former to …

Cognitive science in the era of artificial intelligence: A roadmap for reverse-engineering the infant language-learner

E Dupoux - Cognition, 2018 - Elsevier
Spectacular progress in the information processing sciences (machine learning, wearable
sensors) promises to revolutionize the study of cognitive development. Here, we analyse the …

Evaluating speech features with the minimal-pair ABX task: Analysis of the classical MFC/PLP pipeline

T Schatz, V Peddinti, F Bach, A Jansen… - … 2013: 14th Annual …, 2013 - hal.science
We present a new framework for the evaluation of speech representations in zero-resource
settings, that extends and complements previous work by Carlin, Jansen and Hermansky [1] …

A segmental framework for fully-unsupervised large-vocabulary speech recognition

H Kamper, A Jansen, S Goldwater - Computer Speech & Language, 2017 - Elsevier
Zero-resource speech technology is a growing research area that aims to develop methods
for speech processing in the absence of transcriptions, lexicons, or language modelling text …

Self-supervised language learning from raw audio: Lessons from the zero resource speech challenge

E Dunbar, N Hamilakis… - IEEE Journal of Selected …, 2022 - ieeexplore.ieee.org
Recent progress in self-supervised or unsupervised machine learning has opened the
possibility of building a full speech processing system from raw audio without using any …

Word segmentation on discovered phone units with dynamic programming and self-supervised scoring

H Kamper - IEEE/ACM Transactions on Audio, Speech, and …, 2022 - ieeexplore.ieee.org
Recent work on unsupervised speech segmentation has used self-supervised models with
phone and word segmentation modules that are trained jointly. This paper instead revisits …