Mic2mic: using cycle-consistent generative adversarial networks to overcome microphone variability in speech systems
Mobile and embedded devices are increasingly using microphones and audio-based
computational models to infer user context. A major challenge in building systems that …
Acoustic matching by embedding impulse responses
The goal of acoustic matching is to transform an audio recording made in one acoustic
environment to sound as if it had been recorded in a different environment, based on …
Speaker-aware long short-term memory multi-task learning for speech recognition
In order to address the commonly met issue of overfitting in speech recognition, this article
investigates Multi-Task Learning, when the auxiliary task focuses on speaker classification …
Embeddings for DNN speaker adaptive training
In this work, we investigate the use of embeddings for speaker-adaptive training of DNNs
(DNN-SAT) focusing on a small amount of adaptation data per speaker. DNN-SAT can be …
Hybrid-task learning for robust automatic speech recognition
In order to properly train an automatic speech recognition system, speech with its annotated
transcriptions is most often required. The amount of real annotated data recorded in noisy …
iPREDICT: AI enabled proactive pandemic prediction using biosensing wearable devices
The emergence of pandemics poses a persistent threat to both global health and economic
stability. While zoonotic spillovers and local outbreaks may not be fully preventable, early …
Domain aware training for far-field small-footprint keyword spotting
H Wu, Y Jia, Y Nie, M Li - arXiv preprint arXiv:2005.03633, 2020 - arxiv.org
In this paper, we focus on the task of small-footprint keyword spotting under the far-field
scenario. Far-field environments are commonly encountered in real-life speech applications …
Generating TTS Based Adversarial Samples for Training Wake-Up Word Detection Systems Against Confusing Words.
Wake-up word detection models are widely used in real life, but suffer from severe
performance degradation when encountering adversarial samples. In this paper we discuss …
Progressive neural network-based knowledge transfer in acoustic models
T Moriya, R Masumura, T Asami… - 2018 Asia-Pacific …, 2018 - ieeexplore.ieee.org
This paper presents a novel deep neural network architecture for transfer learning in
acoustic models. A well-known approach for transfer learning is using target domain data to …
On comparison of deep learning architectures for distant speech recognition
Deep Learning technologies are becoming the major approaches for natural signal and
information processing, including for speech recognition. Many architectures for deep …