HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding

Z Chen, Z Zhao, H Luo, H Yao, B Li, J Zhou - arXiv preprint arXiv …, 2024 - arxiv.org
While large vision-language models (LVLMs) have demonstrated impressive capabilities in
interpreting multi-modal contexts, they invariably suffer from object hallucinations (OH). We …

Teaching-Assistant-in-the-Loop: Improving Knowledge Distillation from Imperfect Teacher Models in Low-Budget Scenarios

Y Zhou, W Ai - arXiv preprint arXiv:2406.05322, 2024 - arxiv.org
There is increasing interest in distilling task-specific knowledge from large language models
(LLM) to smaller student models. Nonetheless, LLM distillation presents a dual challenge: 1) …

CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models

P Xia, Z Chen, J Tian, Y Gong, R Hou, Y Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Artificial intelligence has significantly impacted medical applications, particularly with the
advent of Medical Large Vision Language Models (Med-LVLMs), sparking optimism for the …

RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models

P Xia, K Zhu, H Li, H Zhu, Y Li, G Li, L Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
The recent emergence of Medical Large Vision Language Models (Med-LVLMs) has
enhanced medical diagnosis. However, current Med-LVLMs frequently encounter factual …

Multi-Stage Balanced Distillation: Addressing Long-Tail Challenges in Sequence-Level Knowledge Distillation

Y Zhou, J Zhu, P Xu, X Liu, X Wang, D Koutra… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have significantly advanced various natural language
processing tasks, but deploying them remains computationally expensive. Knowledge …

CSRec: Rethinking Sequential Recommendation from A Causal Perspective

X Liu, J Yuan, Y Zhou, J Li, F Huang, W Ai - arXiv preprint arXiv …, 2024 - arxiv.org
The essence of sequential recommender systems (RecSys) lies in understanding how users
make decisions. Most existing approaches frame the task as sequential prediction based on …