Has multimodal learning delivered universal intelligence in healthcare? A comprehensive survey
The rapid development of artificial intelligence has constantly reshaped the field of
intelligent healthcare and medicine. As a vital technology, multimodal learning has …
Maira-2: Grounded radiology report generation
Radiology reporting is a complex task requiring detailed medical image understanding and
precise language generation, for which generative multimodal models offer a promising …
Maira-1: A specialised large multimodal model for radiology report generation
We present a radiology-specific multimodal model for the task of generating radiological
reports from chest X-rays (CXRs). Our work builds on the idea that large language model(s) …
Can Medical Vision-Language Pre-training Succeed with Purely Synthetic Data?
Medical Vision-Language Pre-training (MedVLP) has made significant progress in enabling
zero-shot tasks for medical image understanding. However, training MedVLP models …
MedImageInsight: An open-source embedding model for general domain medical imaging
In this work, we present MedImageInsight, an open-source medical imaging embedding
model. MedImageInsight is trained on medical images with associated text and labels …
DiNO-Diffusion. Scaling Medical Diffusion via Self-Supervised Pre-Training
Diffusion models (DMs) have emerged as powerful foundation models for a variety of tasks,
with a large focus in synthetic image generation. However, their requirement of large …
An X-Ray Is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation
A Abdulaal, H Fry, N Montaña-Brown… - arXiv preprint arXiv …, 2024 - arxiv.org
Radiological services are experiencing unprecedented demand, leading to increased
interest in automating radiology report generation. Existing Vision-Language Models (VLMs) …
LLM-RG4: Flexible and Factual Radiology Report Generation across Diverse Input Contexts
Z Wang, Y Sun, Z Li, X Yang, F Chen, H Liao - arXiv preprint arXiv …, 2024 - arxiv.org
Drafting radiology reports is a complex task requiring flexibility, where radiologists tailor
content to available information and particular clinical demands. However, most current …
M4CXR: Exploring Multi-task Potentials of Multi-modal Large Language Models for Chest X-ray Interpretation
The rapid evolution of artificial intelligence, especially in large language models (LLMs), has
significantly impacted various domains, including healthcare. In chest X-ray (CXR) analysis …
Overcoming data scarcity in biomedical imaging with a foundational multi-task model
R Schäfer, T Nicke, H Höfener, A Lange… - Nature Computational …, 2024 - nature.com
Foundational models, pretrained on a large scale, have demonstrated substantial success
across non-medical domains. However, training these models typically requires large …