Unveiling Hallucination in Text, Image, Video, and Audio Foundation Models: A Comprehensive Survey

P Sahoo, P Meharia, A Ghosh, S Saha, V Jain… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid advancement of foundation models (FMs) across language, image, audio, and
video domains has shown remarkable capabilities in diverse tasks. However, the …

Domain Adaptation for Contrastive Audio-Language Models

S Deshmukh, R Singh, B Raj - arXiv preprint arXiv:2402.09585, 2024 - arxiv.org
Audio-Language Models (ALM) aim to be general-purpose audio models by providing
zero-shot capabilities at test time. The zero-shot performance of ALM improves by using suitable …