OmniBench: Towards the future of universal omni-language models
Recent advancements in multimodal large language models (MLLMs) have aimed to
integrate and interpret data across diverse modalities. However, the capacity of these …
SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation
Large multimodal models (LMMs) have proven flexible and generalisable across many tasks
and fields. Although they have strong potential to aid scientific research, their capabilities in …
MoE-SLU: Towards ASR-Robust Spoken Language Understanding via Mixture-of-Experts
X Cheng, Z Zhu, X Zhuang, Z Chen… - Findings of the …, 2024 - aclanthology.org
As a crucial task in the task-oriented dialogue systems, spoken language understanding
(SLU) has garnered increasing attention. However, errors from automatic speech …
MMRA: A Benchmark for Evaluating Multi-Granularity and Multi-Image Relational Association Capabilities in Large Visual Language Models
Given the remarkable success that large visual language models (LVLMs) have achieved in
image perception tasks, the endeavor to make LVLMs perceive the world like humans is …
LIME: Less Is More for MLLM Evaluation
Multimodal Large Language Models (MLLMs) are evaluated on various benchmarks, such
as image captioning, visual question answering, and reasoning. However, many of these …
Russian-Language Multimodal Dataset for Automatic Summarization of Scientific Papers
A Tsanda, E Bruches - arXiv preprint arXiv:2405.07886, 2024 - arxiv.org
The paper discusses the creation of a multimodal dataset of Russian-language scientific
papers and testing of existing language models for the task of automatic text summarization …
Harnessing CLIP for Evidence Identification in Scientific Literature: A Multimodal Approach to Context24 Shared Task
A Kumar, LL Wang - Proceedings of the Fourth Workshop on …, 2024 - aclanthology.org
Knowing whether scientific claims are supported by evidence is fundamental to scholarly
communication and evidence-based decision-making. We present our approach to Task 1 of …