How does pre-trained wav2vec 2.0 perform on domain-shifted ASR? An extensive benchmark on air traffic control communications
J Zuluaga-Gomez, A Prasad… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org
Recent work on self-supervised pre-training focuses on leveraging large-scale unlabeled
speech data to build robust end-to-end (E2E) acoustic models (AM) that can be later fine …
ATCO2 corpus: A large-scale dataset for research on automatic speech recognition and natural language understanding of air traffic control communications
Personal assistants, automatic speech recognizers and dialogue understanding systems are
becoming more critical in our interconnected digital world. A clear example is air traffic …
A virtual simulation-pilot agent for training of air traffic controllers
In this paper we propose a novel virtual simulation-pilot engine for speeding up air traffic
controller (ATCo) training by integrating different state-of-the-art artificial intelligence (AI) …
A two-step approach to leverage contextual data: speech recognition in air-traffic communications
I Nigmatulina, J Zuluaga-Gomez… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Automatic Speech Recognition (ASR), as the assistance of speech communication between
pilots and air-traffic controllers, can significantly reduce the complexity of the task and …
BERTraffic: BERT-based joint speaker role and speaker change detection for air traffic control communications
Automatic speech recognition (ASR) allows transcribing the communications between air
traffic controllers (ATCOs) and aircraft pilots. The transcriptions are used later to extract ATC …
Automatic flight callsign identification on a controller working position: Real-time simulation and analysis of operational recordings
R García, J Albarrán, A Fabio, F Celorrio… - Aerospace, 2023 - mdpi.com
In the air traffic management (ATM) environment, air traffic controllers (ATCos) and flight
crews (FCs) communicate via voice to exchange different types of data such as commands …
Lessons Learned in ATCO2: 5000 hours of Air Traffic Control Communications for Robust Automatic Speech Recognition and Understanding
J Zuluaga-Gomez, I Nigmatulina, A Prasad… - arXiv preprint arXiv …, 2023 - arxiv.org
Voice communication between air traffic controllers (ATCos) and pilots is critical for ensuring
safe and efficient air traffic control (ATC). This task requires high levels of awareness from …
An Automatic Speaker Clustering Pipeline for the Air Traffic Communication Domain
D Khalil, A Prasad, P Motlicek, J Zuluaga-Gomez… - Aerospace, 2023 - mdpi.com
In air traffic management (ATM), voice communications are critical for ensuring the safe and
efficient operation of aircraft. The pertinent voice communications—air traffic controller …
Call-sign recognition and understanding for noisy air-traffic transcripts using surveillance information
Air traffic control (ATC) relies on communication via speech between pilot and air-traffic
controller (ATCO). The call-sign, as unique identifier for each flight, is used to address a …
Zero-Shot Domain-Sensitive Speech Recognition with Prompt-Conditioning Fine-Tuning
In this work, we propose a method to create domain-sensitive speech recognition models
that utilize textual domain information by conditioning its generation on a given text prompt …