Pay less attention with lightweight and dynamic convolutions
Self-attention is a useful mechanism to build generative models for language and images. It
determines the importance of context elements by comparing each element to the current …
Generalized end-to-end loss for speaker verification
In this paper, we propose a new loss function called generalized end-to-end (GE2E) loss,
which makes the training of speaker verification models more efficient than our previous …
Speaker diarization with LSTM
For many years, i-vector based audio embedding techniques were the dominant approach
for speaker verification and speaker diarization applications. However, mirroring the rise of …
End-to-end text-dependent speaker verification
In this paper we present a data-driven, integrated approach to speaker verification, which
maps a test utterance and a few reference utterances directly to a single score for verification …
A deep neural network model for speaker identification
F Ye, J Yang - Applied Sciences, 2021 - mdpi.com
Speaker identification is a classification task which aims to identify a subject from a given
time-series sequential data. Since the speech signal is a continuous one-dimensional time …
Personalized speech recognition on mobile devices
We describe a large vocabulary speech recognition system that is accurate, has low latency,
and yet has a small enough memory and computational footprint to run faster than real-time …
Trainable frontend for robust and far-field keyword spotting
Robust and far-field speech recognition is critical to enable true hands-free communication.
In far-field conditions, signals are attenuated due to distance. To improve robustness to …
Convolutional CRFs for semantic segmentation
MTT Teichmann, R Cipolla - arXiv preprint arXiv:1805.04777, 2018 - arxiv.org
For the challenging semantic image segmentation task the most efficient models have
traditionally combined the structured modelling capabilities of Conditional Random Fields …
Robust detection of machine-induced audio attacks in intelligent audio systems with microphone array
With the popularity of intelligent audio systems in recent years, their vulnerabilities have
become an increasing public concern. Existing studies have designed a set of machine …
DeepLSS: Breaking parameter degeneracies in large-scale structure with deep-learning analysis of combined probes
T Kacprzak, J Fluri - Physical Review X, 2022 - APS
In classical cosmological analysis of large-scale structure surveys with two-point functions,
the parameter measurement precision is limited by several key degeneracies within the …