SpeechPrompt v2: Prompt tuning for speech classification tasks

KW Chang, YK Wang, H Shen, I Kang… - arXiv preprint arXiv …, 2023 - arxiv.org
Prompt tuning is a technology that tunes a small set of parameters to steer a pre-trained
language model (LM) to directly generate the output for downstream tasks. Recently, prompt …

UniverSLU: Universal spoken language understanding for diverse classification and sequence generation tasks with a single network

S Arora, H Futami, J Jung, Y Peng, R Sharma… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent studies have demonstrated promising outcomes by employing large language
models with multi-tasking capabilities. They utilize prompts to guide the model's behavior …

Prompting Whisper for QA-driven zero-shot end-to-end spoken language understanding

M Li, S Keizer, R Doddipatla - arXiv preprint arXiv:2406.15209, 2024 - arxiv.org
Zero-shot spoken language understanding (SLU) enables systems to comprehend user
utterances in new domains without prior exposure to training data. Recent studies often rely …

WHISMA: A Speech-LLM to Perform Zero-shot Spoken Language Understanding

M Li, CT Do, S Keizer, Y Farag, S Stoyanchev… - arXiv preprint arXiv …, 2024 - arxiv.org
Speech large language models (speech-LLMs) integrate speech and text-based foundation
models to provide a unified framework for handling a wide range of downstream tasks. In …

Direct enhancement of pre-trained speech embeddings for speech processing in noisy conditions

MN Ali, A Brutti, D Falavigna - Computer Speech & Language, 2023 - Elsevier
Lately, the development of deep learning algorithms has marked milestones in the field of
speech processing. In particular, the release of pre-trained feature extraction models has …

Cross-Modal Alignment for End-to-End Spoken Language Understanding Based on Momentum Contrastive Learning

B Zheng, M Ablimit, A Hamdulla - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
The end-to-end spoken language understanding system extracts the semantic intent directly
from an input speech. It effectively avoids problems such as semantic drift in traditional …

Neural Enhancement Strategies for Robust Speech Processing

MNAM Nawar - 2023 - iris.unitn.it
In real-world scenarios, speech signals are often contaminated with environmental noise
and reverberation, which degrade speech quality and intelligibility. Lately, the development …