Cognitive science in the era of artificial intelligence: A roadmap for reverse-engineering the infant language-learner

E Dupoux - Cognition, 2018 - Elsevier
Spectacular progress in the information processing sciences (machine learning, wearable
sensors) promises to revolutionize the study of cognitive development. Here, we analyse the …

Computational modeling of phonetic and lexical learning in early language acquisition: Existing models and future directions

O Räsänen - Speech Communication, 2012 - Elsevier
This work reviews a number of existing computational studies concentrated on the question
of how spoken language can be learned from continuous speech in the absence of …

Efficient spoken term discovery using randomized algorithms

A Jansen, B Van Durme - 2011 IEEE Workshop on Automatic …, 2011 - ieeexplore.ieee.org
Spoken term discovery is the task of automatically identifying words and phrases in speech
data by searching for long repeated acoustic patterns. Initial solutions relied on exhaustive …

What do self-supervised speech models know about words?

A Pasad, CM Chien, S Settle, K Livescu - Transactions of the …, 2024 - direct.mit.edu
Many self-supervised speech models (S3Ms) have been introduced over the last few years,
improving performance and data efficiency on various speech tasks. However, these …

What do self-supervised speech models know about words?

A Pasad, CM Chien, S Settle, K Livescu - arXiv preprint arXiv:2307.00162, 2023 - arxiv.org
Many self-supervised speech models (S3Ms) have been introduced over the last few years,
producing performance and data efficiency improvements for a variety of speech tasks …

Unsupervised word segmentation and lexicon discovery using acoustic word embeddings

H Kamper, A Jansen… - IEEE/ACM Transactions on …, 2016 - ieeexplore.ieee.org
In settings where only unlabeled speech data is available, speech technology needs to be
developed without transcriptions, pronunciation dictionaries, or language modelling text. A …

Rapid evaluation of speech representations for spoken term discovery

MA Carlin, S Thomas, A Jansen… - … Annual Conference of …, 2011 - academia.edu
Acoustic front-ends are typically developed for supervised learning tasks and are thus
optimized to minimize word error rate, phone error rate, etc. However, in recent efforts to …

Semantic speech retrieval with a visually grounded model of untranscribed speech

H Kamper, G Shakhnarovich… - Proceedings of the …, 2018 - openaccess.thecvf.com
There is growing interest in speech models that can learn from unlabelled speech paired
with visual context. Here we study how a visually grounded speech model, trained on …

Unsupervised discovery of recurring speech patterns using probabilistic adaptive metrics

O Räsänen, MAC Blandón - arXiv preprint arXiv:2008.00731, 2020 - arxiv.org
Unsupervised spoken term discovery (UTD) aims at finding recurring segments of speech
from a corpus of acoustic speech data. One potential approach to this problem is to use …

Weak top-down constraints for unsupervised acoustic model training

A Jansen, S Thomas… - 2013 IEEE International …, 2013 - ieeexplore.ieee.org
Typical supervised acoustic model training relies on strong top-down constraints provided
by dynamic programming alignment of the input observations to phonetic sequences …