PMC-CLIP: Contrastive language-image pre-training using biomedical documents

W Lin, Z Zhao, X Zhang, C Wu, Y Zhang… - … Conference on Medical …, 2023 - Springer
Foundation models trained on large-scale dataset gain a recent surge in CV and NLP. In
contrast, development in biomedical domain lags far behind due to data scarcity. To address …

Vision-language models for medical report generation and visual question answering: A review

I Hartsock, G Rasool - arXiv preprint arXiv:2403.02469, 2024 - arxiv.org
Medical vision-language models (VLMs) combine computer vision and natural language
processing to analyze visual and textual medical data. Our paper reviews recent …

Retrieving multimodal information for augmented generation: A survey

R Zhao, H Chen, W Wang, F Jiao, XL Do, C Qin… - arXiv preprint arXiv …, 2023 - arxiv.org
As Large Language Models (LLMs) become popular, there emerged an important trend of
using multimodality to augment the LLMs' generation ability, which enables LLMs to better …

A survey of pre-trained language models for processing scientific text

X Ho, AKD Nguyen, AT Dao, J Jiang, Y Chida… - arXiv preprint arXiv …, 2024 - arxiv.org
The number of Language Models (LMs) dedicated to processing scientific text is on the rise.
Keeping pace with the rapid growth of scientific LMs (SciLMs) has become a daunting task …

MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning

S He, Y Nie, Z Chen, Z Cai, H Wang, S Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid advancement of large-scale vision-language models has showcased remarkable
capabilities across various tasks. However, the lack of extensive and high-quality image-text …

LADER: Log-augmented dense retrieval for biomedical literature search

Q Jin, A Shin, Z Lu - Proceedings of the 46th International ACM SIGIR …, 2023 - dl.acm.org
Queries with similar information needs tend to have similar document clicks, especially in
biomedical literature search engines where queries are generally short and top documents …

RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models

P Xia, K Zhu, H Li, H Zhu, Y Li, G Li, L Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
The recent emergence of Medical Large Vision Language Models (Med-LVLMs) has
enhanced medical diagnosis. However, current Med-LVLMs frequently encounter factual …

Semantic Alignment for Multimodal Large Language Models

T Wu, M Li, J Chen, W Ji, W Lin, J Gao, K Kuang… - arXiv preprint arXiv …, 2024 - arxiv.org
Research on Multi-modal Large Language Models (MLLMs) towards the multi-image cross-
modal instruction has received increasing attention and made significant progress …

A Survey on Medical Large Language Models: Technology, Application, Trustworthiness, and Future Directions

L Liu, X Yang, J Lei, X Liu, Y Shen, Z Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs), such as GPT series models, have received substantial
attention due to their impressive capabilities for generating and understanding human-level …

Adapting pretrained vision-language models in medical domains

L Li - 2024 - mlmi.eng.cam.ac.uk
With the rise of large-scale vision-language pretraining (VLP), tremendous progress has
been achieved in handling complex multi-modal information in general domain. However …