Parrot: Multilingual Visual Instruction Tuning
The rapid development of Multimodal Large Language Models (MLLMs) like GPT-4V has
marked a significant step towards artificial general intelligence. Existing methods mainly …
TroL: Traversal of Layers for Large Language and Vision Models
Large language and vision models (LLVMs) have been driven by the generalization power
of large language models (LLMs) and the advent of visual instruction tuning. Along with …
CODE: Contrasting Self-generated Description to Combat Hallucination in Large Multi-modal Models
Large Multi-modal Models (LMMs) have recently demonstrated remarkable abilities in visual
context understanding and coherent response generation. However, alongside these …