SpeechPrompt v2: Prompt tuning for speech classification tasks
Prompt tuning is a technology that tunes a small set of parameters to steer a pre-trained
language model (LM) to directly generate the output for downstream tasks. Recently, prompt …
UniverSLU: Universal spoken language understanding for diverse classification and sequence generation tasks with a single network
Recent studies have demonstrated promising outcomes by employing large language
models with multi-tasking capabilities. They utilize prompts to guide the model's behavior …
Prompting Whisper for QA-driven zero-shot end-to-end spoken language understanding
Zero-shot spoken language understanding (SLU) enables systems to comprehend user
utterances in new domains without prior exposure to training data. Recent studies often rely …
WHISMA: A Speech-LLM to Perform Zero-shot Spoken Language Understanding
Speech large language models (speech-LLMs) integrate speech and text-based foundation
models to provide a unified framework for handling a wide range of downstream tasks. In …
Direct enhancement of pre-trained speech embeddings for speech processing in noisy conditions
Lately, the development of deep learning algorithms has marked milestones in the field of
speech processing. In particular, the release of pre-trained feature extraction models has …
Cross-Modal Alignment for End-to-End Spoken Language Understanding Based on Momentum Contrastive Learning
B Zheng, M Ablimit, A Hamdulla - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
The end-to-end spoken language understanding system extracts the semantic intent directly
from an input speech. It effectively avoids problems such as semantic drift in traditional …
Neural Enhancement Strategies for Robust Speech Processing
MNAM Nawar - 2023 - iris.unitn.it
In real-world scenarios, speech signals are often contaminated with environmental noises,
and reverberation, which degrades speech quality and intelligibility. Lately, the development …