RAMM: Retrieval-augmented biomedical visual question answering with multi-modal pre-training

W Lin, Z Zhao, X Zhang, C Wu, Y Zhang… - … Conference on Medical …, 2023 - Springer

Foundation models trained on large-scale dataset gain a recent surge in CV and NLP. In
contrast, development in biomedical domain lags far behind due to data scarcity. To address …

Cited by 77 Related articles All 6 versions

Vision-language models for medical report generation and visual question answering: A review

I Hartsock, G Rasool - arXiv preprint arXiv:2403.02469, 2024 - arxiv.org

Medical vision-language models (VLMs) combine computer vision and natural language
processing to analyze visual and textual medical data. Our paper reviews recent …

Cited by 10 Related articles All 2 versions

Retrieving multimodal information for augmented generation: A survey

R Zhao, H Chen, W Wang, F Jiao, XL Do, C Qin… - arXiv preprint arXiv …, 2023 - arxiv.org

As Large Language Models (LLMs) become popular, there emerged an important trend of
using multimodality to augment the LLMs' generation ability, which enables LLMs to better …

Cited by 34 Related articles All 5 versions

A survey of pre-trained language models for processing scientific text

X Ho, AKD Nguyen, AT Dao, J Jiang, Y Chida… - arXiv preprint arXiv …, 2024 - arxiv.org

The number of Language Models (LMs) dedicated to processing scientific text is on the rise.
Keeping pace with the rapid growth of scientific LMs (SciLMs) has become a daunting task …

Cited by 4 Related articles All 2 versions

MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning

S He, Y Nie, Z Chen, Z Cai, H Wang, S Yang… - arXiv preprint arXiv …, 2024 - arxiv.org

The rapid advancement of large-scale vision-language models has showcased remarkable
capabilities across various tasks. However, the lack of extensive and high-quality image-text …

Cited by 4 Related articles All 2 versions

LADER: Log-augmented dense retrieval for biomedical literature search

Q Jin, A Shin, Z Lu - Proceedings of the 46th International ACM SIGIR …, 2023 - dl.acm.org

Queries with similar information needs tend to have similar document clicks, especially in
biomedical literature search engines where queries are generally short and top documents …

[PDF] Adapting pretrained vision-language models in medical domains

L Li - 2024 - mlmi.eng.cam.ac.uk

With the rise of large-scale vision-language pretraining (VLP), tremendous progress has
been achieved in handling complex multi-modal information in general domain. However …

Cited by 2 Related articles