What if the TV was off? Examining counterfactual reasoning abilities of multi-modal language models

L Zhang, X Zhai, Z Zhao, Y Zong… - Proceedings of the …, 2024 - openaccess.thecvf.com
Counterfactual reasoning, a fundamental aspect of human cognition, involves contemplating
alternatives to established facts or past events, significantly enhancing our abilities in …

Tuning LayerNorm in Attention: Towards efficient multi-modal LLM finetuning

B Zhao, H Tu, C Wei, J Mei, C Xie - arXiv preprint arXiv:2312.11420, 2023 - arxiv.org
This paper introduces an efficient strategy to transform Large Language Models (LLMs) into
Multi-Modal Large Language Models (MLLMs). By conceptualizing this transformation as a …

Sight beyond text: Multi-modal training enhances LLMs in truthfulness and ethics

H Tu, B Zhao, C Wei, C Xie - arXiv preprint arXiv:2309.07120, 2023 - arxiv.org
Multi-modal large language models (MLLMs) are trained based on large language models
(LLMs), with an enhanced capability to comprehend multi-modal inputs and generate textual …

Unsupervised camouflaged object segmentation as domain adaptation

Y Zhang, C Wu - Proceedings of the IEEE/CVF International …, 2023 - openaccess.thecvf.com
Deep learning for unsupervised image segmentation remains challenging due to the
absence of human labels. The common idea is to train a segmentation head, with the …