SoK: The faults in our ASRs: An overview of attacks against automatic speech recognition and speaker identification systems

H Abdullah, K Warren, V Bindschaedler… - … IEEE symposium on …, 2021 - ieeexplore.ieee.org
Speech and speaker recognition systems are employed in a variety of applications, from
personal assistants to telephony surveillance and biometric authentication. The wide …

Adaptation algorithms for neural network-based speech recognition: An overview

P Bell, J Fainberg, O Klejch, J Li… - IEEE Open Journal …, 2020 - ieeexplore.ieee.org
We present a structured overview of adaptation algorithms for neural network-based speech
recognition, considering both hybrid hidden Markov model/neural network systems and end …

Unsupervised speech recognition

A Baevski, WN Hsu, A Conneau… - Advances in Neural …, 2021 - proceedings.neurips.cc
Despite rapid progress in the recent past, current speech recognition systems still require
labeled training data which limits this technology to a small fraction of the languages spoken …

Audio adversarial examples: Targeted attacks on speech-to-text

N Carlini, D Wagner - 2018 IEEE security and privacy …, 2018 - ieeexplore.ieee.org
We construct targeted audio adversarial examples on automatic speech recognition. Given
any audio waveform, we can produce another that is over 99.9% similar, but transcribes as …

Accurate, data-efficient, unconstrained text recognition with convolutional neural networks

M Yousef, KF Hussain, US Mohammed - Pattern Recognition, 2020 - Elsevier
Unconstrained text recognition is an important computer vision task, featuring a wide variety
of different sub-tasks, each with its own set of challenges. One of the biggest promises of …

LPRNet: License plate recognition via deep neural networks

S Zherzdev, A Gruzdev - arXiv preprint arXiv:1806.10447, 2018 - arxiv.org
This paper proposes LPRNet, an end-to-end method for Automatic License Plate Recognition
without preliminary character segmentation. Our approach is inspired by recent …

Metamorph: Injecting inaudible commands into over-the-air voice controlled systems

T Chen, L Shangguan, Z Li, K Jamieson - Network and Distributed …, 2020 - par.nsf.gov
This paper presents Metamorph, a system that generates imperceptible audio that can
survive over-the-air transmission to attack the neural network of a speech recognition …

SpecPatch: Human-in-the-loop adversarial audio spectrogram patch attack on speech recognition

H Guo, Y Wang, N Ivanov, L Xiao, Q Yan - Proceedings of the 2022 ACM …, 2022 - dl.acm.org
In this paper, we propose SpecPatch, a human-in-the-loop adversarial audio attack on
automated speech recognition (ASR) systems. Existing audio adversarial attacker assumes …

LLMs are good sign language translators

J Gong, LG Foo, Y He… - Proceedings of the …, 2024 - openaccess.thecvf.com
Sign Language Translation (SLT) is a challenging task that aims to translate sign
videos into spoken language. Inspired by the strong translation capabilities of large …

Multi-task learning for speaker verification and voice trigger detection

S Sigtia, E Marchi, S Kajarekar, D Naik… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
Automatic speech transcription and speaker recognition are usually treated as separate
tasks even though they are interdependent. In this study, we investigate training a single …