A review of deep learning techniques for speech processing
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …
learning. The use of multiple processing layers has enabled the creation of models capable …
Self-supervised speech representation learning: A review
Although supervised deep learning has revolutionized speech and audio processing, it has
necessitated the building of specialist models for individual tasks and application scenarios …
necessitated the building of specialist models for individual tasks and application scenarios …
Listen, think, and understand
The ability of artificial intelligence (AI) systems to perceive and comprehend audio signals is
crucial for many applications. Although significant progress has been made in this area …
crucial for many applications. Although significant progress has been made in this area …
Prompting the hidden talent of web-scale speech models for zero-shot task generalization
We investigate the emergent abilities of the recently proposed web-scale speech model
Whisper, by adapting it to unseen tasks with prompt engineering. We selected three tasks …
Whisper, by adapting it to unseen tasks with prompt engineering. We selected three tasks …
Can chatgpt detect intent? evaluating large language models for spoken language understanding
Recently, large pretrained language models have demonstrated strong language
understanding capabilities. This is particularly reflected in their zero-shot and in-context …
understanding capabilities. This is particularly reflected in their zero-shot and in-context …
Speechprompt v2: Prompt tuning for speech classification tasks
Prompt tuning is a technology that tunes a small set of parameters to steer a pre-trained
language model (LM) to directly generate the output for downstream tasks. Recently, prompt …
language model (LM) to directly generate the output for downstream tasks. Recently, prompt …
Exploring efficient-tuning methods in self-supervised speech models
In this study, we aim to explore efficient tuning methods for speech self-supervised learning.
Recent studies show that self-supervised learning (SSL) can learn powerful representations …
Recent studies show that self-supervised learning (SSL) can learn powerful representations …
Speechgen: Unlocking the generative power of speech language models with prompts
Large language models (LLMs) have gained considerable attention for Artificial Intelligence
Generated Content (AIGC), particularly with the emergence of ChatGPT. However, the direct …
Generated Content (AIGC), particularly with the emergence of ChatGPT. However, the direct …
From english to more languages: Parameter-efficient model reprogramming for cross-lingual speech recognition
In this work, we propose a new parameter-efficient learning framework based on neural
model reprogramming for cross-lingual speech recognition, which can re-purpose well …
model reprogramming for cross-lingual speech recognition, which can re-purpose well …
Toward universal speech enhancement for diverse input conditions
The past decade has witnessed substantial growth of data-driven speech enhancement (SE)
techniques thanks to deep learning. While existing approaches have shown impressive …
techniques thanks to deep learning. While existing approaches have shown impressive …