A comprehensive overview of large language models
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …
natural language processing tasks and beyond. This success of LLMs has led to a large …
A survey on multimodal large language models
Multimodal Large Language Model (MLLM) recently has been a new rising research
hotspot, which uses powerful Large Language Models (LLMs) as a brain to perform …
hotspot, which uses powerful Large Language Models (LLMs) as a brain to perform …
Generative AI and ChatGPT: Applications, challenges, and AI-human collaboration
F Fui-Hoon Nah, R Zheng, J Cai, K Siau… - Journal of Information …, 2023 - Taylor & Francis
Artificial intelligence (AI) has elicited much attention across disciplines and industries (Hyder
et al., 2019). AI has been defined as “a system's ability to correctly interpret external data, to …
et al., 2019). AI has been defined as “a system's ability to correctly interpret external data, to …
Vila: On pre-training for visual language models
Visual language models (VLMs) rapidly progressed with the recent success of large
language models. There have been growing efforts on visual instruction tuning to extend the …
language models. There have been growing efforts on visual instruction tuning to extend the …
Multimodal foundation models: From specialists to general-purpose assistants
Neural compression is the application of neural networks and other machine learning
methods to data compression. Recent advances in statistical machine learning have opened …
methods to data compression. Recent advances in statistical machine learning have opened …
Sharegpt4v: Improving large multi-modal models with better captions
In the realm of large multi-modal models (LMMs), efficient modality alignment is crucial yet
often constrained by the scarcity of high-quality image-text data. To address this bottleneck …
often constrained by the scarcity of high-quality image-text data. To address this bottleneck …
Chat-univi: Unified visual representation empowers large language models with image and video understanding
P Jin, R Takanobu, W Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Large language models have demonstrated impressive universal capabilities across a wide
range of open-ended tasks and have extended their utility to encompass multimodal …
range of open-ended tasks and have extended their utility to encompass multimodal …
Internlm-xcomposer: A vision-language large model for advanced text-image comprehension and composition
We propose InternLM-XComposer, a vision-language large model that enables advanced
image-text comprehension and composition. The innovative nature of our model is …
image-text comprehension and composition. The innovative nature of our model is …
How Robust is Google's Bard to Adversarial Image Attacks?
Multimodal Large Language Models (MLLMs) that integrate text and other modalities
(especially vision) have achieved unprecedented performance in various multimodal tasks …
(especially vision) have achieved unprecedented performance in various multimodal tasks …
The (r) evolution of multimodal large language models: A survey
Connecting text and visual modalities plays an essential role in generative intelligence. For
this reason, inspired by the success of large language models, significant research efforts …
this reason, inspired by the success of large language models, significant research efforts …