SoK: The faults in our ASRs: An overview of attacks against automatic speech recognition and speaker identification systems
Speech and speaker recognition systems are employed in a variety of applications, from
personal assistants to telephony surveillance and biometric authentication. The wide …
Adaptation algorithms for neural network-based speech recognition: An overview
We present a structured overview of adaptation algorithms for neural network-based speech
recognition, considering both hybrid hidden Markov model/neural network systems and end …
Unsupervised speech recognition
Despite rapid progress in the recent past, current speech recognition systems still require
labeled training data which limits this technology to a small fraction of the languages spoken …
Audio adversarial examples: Targeted attacks on speech-to-text
We construct targeted audio adversarial examples on automatic speech recognition. Given
any audio waveform, we can produce another that is over 99.9% similar, but transcribes as …
Accurate, data-efficient, unconstrained text recognition with convolutional neural networks
M Yousef, KF Hussain, US Mohammed - Pattern Recognition, 2020 - Elsevier
Unconstrained text recognition is an important computer vision task, featuring a wide variety
of different sub-tasks, each with its own set of challenges. One of the biggest promises of …
Lprnet: License plate recognition via deep neural networks
S Zherzdev, A Gruzdev - arXiv preprint arXiv:1806.10447, 2018 - arxiv.org
This paper proposes LPRNet - end-to-end method for Automatic License Plate Recognition
without preliminary character segmentation. Our approach is inspired by recent …
Metamorph: Injecting inaudible commands into over-the-air voice controlled systems
This paper presents Metamorph, a system that generates imperceptible audio that can
survive over-the-air transmission to attack the neural network of a speech recognition …
Specpatch: Human-in-the-loop adversarial audio spectrogram patch attack on speech recognition
In this paper, we propose SpecPatch, a human-in-the-loop adversarial audio attack on
automated speech recognition (ASR) systems. Existing audio adversarial attacker assumes …
LLMs are good sign language translators
Sign Language Translation (SLT) is a challenging task that aims to translate sign
videos into spoken language. Inspired by the strong translation capabilities of large …
Multi-task learning for speaker verification and voice trigger detection
Automatic speech transcription and speaker recognition are usually treated as separate
tasks even though they are interdependent. In this study, we investigate training a single …