SALMONN: Towards generic hearing abilities for large language models
Hearing is arguably an essential ability of artificial intelligence (AI) agents in the physical
world, which refers to the perception and understanding of general auditory information …
Sparks of large audio models: A survey and outlook
This survey paper provides a comprehensive overview of the recent advancements and
challenges in applying large language models to the field of audio signal processing. Audio …
Connecting speech encoder and large language model for ASR
The impressive capability and versatility of large language models (LLMs) have aroused
increasing attention in automatic speech recognition (ASR), with several pioneering studies …
Can Whisper Perform Speech-Based In-Context Learning?
This paper investigates the in-context learning abilities of the Whisper automatic speech
recognition (ASR) models released by OpenAI. A novel speech-based in-context learning …
SALM: Speech-augmented language model with in-context learning for speech recognition and translation
We present a novel Speech Augmented Language Model (SALM) with multitask and in-
context learning capabilities. SALM comprises a frozen text LLM, an audio encoder, a …
COSMIC: Data efficient instruction-tuning for speech in-context learning
We present a data and cost efficient way of incorporating the speech modality into a large
language model (LLM). The resulting multi-modal LLM is a COntextual Speech Model with …
End-to-end speech recognition contextualization with large language models
In recent years, Large Language Models (LLMs) have garnered significant attention from the
research community due to their exceptional performance and generalization capabilities. In …
Boosting large language model for speech synthesis: An empirical study
Large language models (LLMs) have made significant advancements in natural language
processing and are concurrently extending the language ability to other modalities, such as …
Exploring autonomous agents through the lens of large language models: A review
S Barua - arXiv preprint arXiv:2404.04442, 2024 - arxiv.org
Large Language Models (LLMs) are transforming artificial intelligence, enabling
autonomous agents to perform diverse tasks across various domains. These agents …
TransLLaMa: LLM-based simultaneous translation system
Decoder-only large language models (LLMs) have recently demonstrated impressive
capabilities in text generation and reasoning. Nonetheless, they have limited applications in …