Self-supervised contrastive learning for unsupervised phoneme segmentation

F Kreuk, J Keshet, Y Adi - arXiv preprint arXiv:2007.13465, 2020 - arxiv.org
We propose a self-supervised representation learning model for the task of unsupervised
phoneme boundary detection. The model is a convolutional neural network that operates …

Computational modeling of phonetic and lexical learning in early language acquisition: Existing models and future directions

O Räsänen - Speech Communication, 2012 - Elsevier
This work reviews a number of existing computational studies concentrated on the question
of how spoken language can be learned from continuous speech in the absence of …

What do self-supervised speech models know about words?

A Pasad, CM Chien, S Settle, K Livescu - Transactions of the …, 2024 - direct.mit.edu
Many self-supervised speech models (S3Ms) have been introduced over the last few years,
improving performance and data efficiency on various speech tasks. However, these …

Unsupervised speech recognition via segmental empirical output distribution matching

CK Yeh, J Chen, C Yu, D Yu - arXiv preprint arXiv:1812.09323, 2018 - arxiv.org
We consider the problem of training speech recognition systems without using any labeled
data, under the assumption that the learner can only access the input utterances and a …

Phoneme boundary detection using learnable segmental features

F Kreuk, Y Sheena, J Keshet… - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
Phoneme boundary detection is an essential first step for a variety of speech processing
applications such as speaker diarization, speech science, keyword spotting, etc. In this work …

Blind phoneme segmentation with temporal prediction errors

P Michel, O Räsänen, R Thiolliere… - arXiv preprint arXiv …, 2016 - arxiv.org
Phonemic segmentation of speech is a critical step of speech recognition systems. We
propose a novel unsupervised algorithm based on sequence prediction models such as …

A computational model of word segmentation from continuous speech using transitional probabilities of atomic acoustic events

O Räsänen - Cognition, 2011 - Elsevier
Word segmentation from continuous speech is a difficult task that is faced by human infants
when they start to learn their native language. Several studies indicate that infants might use …

Blind phone segmentation based on spectral change detection using Legendre polynomial approximation

DT Hoang, HC Wang - The Journal of the Acoustical Society of …, 2015 - pubs.aip.org
Phone segmentation involves partitioning a continuous speech signal into discrete phone
units. In this paper, a method for automatic phone segmentation without prior knowledge of …

Blind speech segmentation using spectrogram image-based features and mel cepstral coefficients

A Stan, C Valentini-Botinhao, B Orza… - 2016 IEEE Spoken …, 2016 - ieeexplore.ieee.org
This paper introduces a novel method for blind speech segmentation at a phone level based
on image processing. We consider the spectrogram of the waveform of an utterance as an …

Automatic segmentation and classification of dysfluencies in stuttering speech

P Mahesha, DS Vinod - … of the Second International Conference on …, 2016 - dl.acm.org
The automatic segmentation of pathological speech has gained importance in the field of
clinical speech processing. In a disorder like stuttering, dysfluencies present in continuous …