The shaky foundations of large language models and foundation models for electronic health records

M Wornow, Y Xu, R Thapa, B Patel, E Steinberg… - npj Digital …, 2023 - nature.com
The success of foundation models such as ChatGPT and AlphaFold has spurred significant
interest in building similar models for electronic medical records (EMRs) to improve patient …

Pre-trained language models in biomedical domain: A systematic survey

B Wang, Q Xie, J Pei, Z Chen, P Tiwari, Z Li… - ACM Computing …, 2023 - dl.acm.org
Pre-trained language models (PLMs) have been the de facto paradigm for most natural
language processing tasks. This also benefits the biomedical domain: researchers from …

LLaVA-Med: Training a large language-and-vision assistant for biomedicine in one day

C Li, C Wong, S Zhang, N Usuyama… - Advances in …, 2024 - proceedings.neurips.cc
Conversational generative AI has demonstrated remarkable promise for empowering
biomedical practitioners, but current investigations focus on unimodal text. Multimodal …

Knowledge-enhanced visual-language pre-training on chest radiology images

X Zhang, C Wu, Y Zhang, W Xie, Y Wang - Nature Communications, 2023 - nature.com
While multi-modal foundation models pre-trained on large-scale data have been successful
in natural language understanding and vision recognition, their use in medical domains is …

Learning to exploit temporal structure for biomedical vision-language processing

S Bannur, S Hyland, Q Liu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Self-supervised learning in vision-language processing (VLP) exploits semantic alignment
between imaging and text modalities. Prior work in biomedical VLP has mostly relied on the …

Large-scale domain-specific pretraining for biomedical vision-language processing

S Zhang, Y Xu, N Usuyama, J Bagga… - arXiv preprint arXiv …, 2023 - researchgate.net
Contrastive pretraining on parallel image-text data has attained great success in vision-
language processing (VLP), as exemplified by CLIP and related methods. However, prior …
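The CLIP-style contrastive pretraining this entry refers to trains paired image and text encoders so that matched image-text pairs score higher than all mismatched pairs in a batch, via a symmetric cross-entropy (InfoNCE) objective. A minimal NumPy sketch of that loss, assuming precomputed embeddings (the function name and shapes are illustrative, not from the paper):

```python
import numpy as np

def clip_style_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    img_emb, txt_emb: (N, D) arrays where row i of each is a matched
    image-text pair; all other rows serve as in-batch negatives.
    """
    # L2-normalise so dot products become cosine similarities
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature       # (N, N) similarity matrix
    n = len(img)                             # matched pairs lie on the diagonal

    def xent(l):
        # row-wise cross-entropy against the diagonal "correct pair" labels
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[np.arange(n), np.arange(n)].mean()

    # average the image-to-text and text-to-image directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

With perfectly aligned pairs (e.g. identical orthonormal embeddings) the loss approaches zero; shuffling the text rows so pairs no longer match drives it up, which is the gradient signal the encoders are trained on.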

Med-UniC: Unifying cross-lingual medical vision-language pre-training by diminishing bias

Z Wan, C Liu, M Zhang, J Fu, B Wang… - Advances in …, 2024 - proceedings.neurips.cc
The scarcity of data presents a critical obstacle to the efficacy of medical vision-language pre-
training (VLP). A potential solution lies in the combination of datasets from various language …

MedKLIP: Medical knowledge enhanced language-image pre-training for X-ray diagnosis

C Wu, X Zhang, Y Zhang, Y Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this paper, we consider enhancing medical visual-language pre-training (VLP) with
domain-specific knowledge, by exploiting the paired image-text reports from the radiological …

A medical multimodal large language model for future pandemics

F Liu, T Zhu, X Wu, B Yang, C You, C Wang, L Lu… - npj Digital …, 2023 - nature.com
Deep neural networks have been integrated into the whole clinical decision procedure,
which can improve the efficiency of diagnosis and alleviate the heavy workload of …

PRIOR: Prototype representation joint learning from medical images and reports

P Cheng, L Lin, J Lyu, Y Huang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Contrastive learning based vision-language joint pre-training has emerged as a successful
representation learning strategy. In this paper, we present a prototype representation …