Mic2mic: using cycle-consistent generative adversarial networks to overcome microphone variability in speech systems

A Mathur, A Isopoussu, F Kawsar, N Berthouze… - Proceedings of the 18th …, 2019 - dl.acm.org
Mobile and embedded devices are increasingly using microphones and audio-based
computational models to infer user context. A major challenge in building systems that …

Acoustic matching by embedding impulse responses

J Su, Z Jin, A Finkelstein - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
The goal of acoustic matching is to transform an audio recording made in one acoustic
environment to sound as if it had been recorded in a different environment, based on …

Speaker-aware long short-term memory multi-task learning for speech recognition

G Pironkov, S Dupont, T Dutoit - 2016 24th European Signal …, 2016 - ieeexplore.ieee.org
In order to address the commonly met issue of overfitting in speech recognition, this article
investigates Multi-Task Learning, when the auxiliary task focuses on speaker classification …

Embeddings for DNN speaker adaptive training

J Rownicka, P Bell, S Renals - 2019 IEEE Automatic Speech …, 2019 - ieeexplore.ieee.org
In this work, we investigate the use of embeddings for speaker-adaptive training of DNNs
(DNN-SAT) focusing on a small amount of adaptation data per speaker. DNN-SAT can be …

Hybrid-task learning for robust automatic speech recognition

G Pironkov, SUN Wood, S Dupont - Computer Speech & Language, 2020 - Elsevier
In order to properly train an automatic speech recognition system, speech with its annotated
transcriptions is most often required. The amount of real annotated data recorded in noisy …

iPREDICT: AI enabled proactive pandemic prediction using biosensing wearable devices

MS Riaz, M Shaukat, T Saeed, A Ijaz… - Informatics in Medicine …, 2024 - Elsevier
The emergence of pandemics poses a persistent threat to both global health and economic
stability. While zoonotic spillovers and local outbreaks may not be fully preventable, early …

Domain aware training for far-field small-footprint keyword spotting

H Wu, Y Jia, Y Nie, M Li - arXiv preprint arXiv:2005.03633, 2020 - arxiv.org
In this paper, we focus on the task of small-footprint keyword spotting under the far-field
scenario. Far-field environments are commonly encountered in real-life speech applications …

Generating TTS Based Adversarial Samples for Training Wake-Up Word Detection Systems Against Confusing Words

H Wang, Y Jia, Z Zhao, X Wang, J Wang, M Li - Odyssey, 2022 - isca-archive.org
Wake-up word detection models are widely used in real life, but suffer from severe
performance degradation when encountering adversarial samples. In this paper we discuss …

Progressive neural network-based knowledge transfer in acoustic models

T Moriya, R Masumura, T Asami… - 2018 Asia-Pacific …, 2018 - ieeexplore.ieee.org
This paper presents a novel deep neural network architecture for transfer learning in
acoustic models. A well-known approach for transfer learning is using target domain data to …

On comparison of deep learning architectures for distant speech recognition

R Sustika, AR Yuliani, E Zaenudin… - 2017 2nd International …, 2017 - ieeexplore.ieee.org
Deep Learning technologies are becoming the major approaches for natural signal and
information processing, including speech recognition. Many architectures for deep …