WavChat: A Survey of Spoken Dialogue Models
Recent advancements in spoken dialogue models, exemplified by systems like GPT-4o,
have captured significant attention in the speech domain. Compared to traditional three-tier …
VoiceBench: Benchmarking LLM-based voice assistants
Building on the success of large language models (LLMs), recent advancements such as
GPT-4o have enabled real-time speech interactions through LLM-based voice assistants …
SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context
J Li, S Tao, Y Yan, X Gu, H Xu, X Zheng, Y Lyu… - arXiv preprint arXiv …, 2024 - arxiv.org
Endeavors have been made to explore Large Language Models for video analysis (Video-
LLMs), particularly in understanding and interpreting long videos. However, existing Video …
From Specific-MLLM to Omni-MLLM: A Survey about the MLLMs aligned with Multi-Modality
From the Specific-MLLM, which excels in single-modal tasks, to the Omni-MLLM, which
extends the range of general modalities, this evolution aims to achieve understanding and …
Towards Unifying Understanding and Generation in the Era of Vision Foundation Models: A Survey from the Autoregression Perspective
Autoregression in large language models (LLMs) has shown impressive scalability by
unifying all language tasks into the next token prediction paradigm. Recently, there is a …
ChipAlign: Instruction Alignment in Large Language Models for Chip Design via Geodesic Interpolation
Recent advancements in large language models (LLMs) have expanded their application
across various domains, including chip design, where domain-adapted chip models like …