Medical vision language pretraining: A survey

P Shrestha, S Amgain, B Khanal, CA Linte… - arXiv preprint arXiv …, 2023 - arxiv.org
Medical Vision Language Pretraining (VLP) has recently emerged as a promising solution to
the scarcity of labeled data in the medical domain. By leveraging paired/unpaired vision and …

HecVL: Hierarchical Video-Language Pretraining for Zero-shot Surgical Phase Recognition

K Yuan, V Srivastav, N Navab, N Padoy - arXiv preprint arXiv:2405.10075, 2024 - arxiv.org
Natural language could play an important role in developing generalist surgical models by
providing a broad source of supervision from raw texts. This flexible form of supervision can …

Enhancing Gait Video Analysis in Neurodegenerative Diseases by Knowledge Augmentation in Vision Language Model

D Wang, K Yuan, C Muller, F Blanc, N Padoy… - arXiv preprint arXiv …, 2024 - arxiv.org
We present a knowledge augmentation strategy for assessing the diagnostic groups and
gait impairment from monocular gait videos. Based on a large-scale pre-trained Vision …