A Survey of Multimodal Large Language Model from A Data-centric Perspective
Human beings perceive the world through diverse senses such as sight, smell, hearing, and
touch. Similarly, multimodal large language models (MLLMs) enhance the capabilities of …
The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective
The rapid development of large language models (LLMs) has been witnessed in recent
years. Based on the powerful LLMs, multi-modal LLMs (MLLMs) extend the modality from …
SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models
Recently, with the rise of web images, managing and understanding large-scale image
datasets has become increasingly important. Vision Large Language Models (VLLMs) have …
MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark
With the development of Multimodal Large Language Models (MLLMs), the evaluation of
multimodal models in the context of mathematical problems has become a valuable …
Tuning-Free Alignment of Diffusion Models with Direct Noise Optimization
In this work, we focus on the alignment problem of diffusion models with a continuous
reward function, which represents specific objectives for downstream tasks, such as …