Foundations & trends in multimodal machine learning: Principles, challenges, and open questions

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2024 - dl.acm.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions

PP Liang, A Zadeh, LP Morency - arXiv preprint arXiv:2209.03430, 2022 - arxiv.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging

S Azizi, L Culp, J Freyberg, B Mustafa, S Baur… - Nature Biomedical …, 2023 - nature.com
Machine-learning models for medical tasks can match or surpass the performance
of clinical experts. However, in settings differing from those of the training dataset, the …

Making the most of text semantics to improve biomedical vision–language processing

B Boecking, N Usuyama, S Bannur, DC Castro… - European conference on …, 2022 - Springer
Multi-modal data abounds in biomedicine, such as radiology images and reports.
Interpreting this data at scale is essential for improving clinical care and accelerating clinical …

MedKLIP: Medical knowledge enhanced language-image pre-training for x-ray diagnosis

C Wu, X Zhang, Y Zhang, Y Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this paper, we consider enhancing medical visual-language pre-training (VLP) with
domain-specific knowledge, by exploiting the paired image-text reports from the radiological …

Multimodal co-learning: Challenges, applications with datasets, recent advances and future directions

A Rahate, R Walambe, S Ramanna, K Kotecha - Information Fusion, 2022 - Elsevier
Multimodal deep learning systems that employ multiple modalities like text, image, audio,
video, etc., are showing better performance than individual modalities (i.e., unimodal) …

WRENCH: A comprehensive benchmark for weak supervision

J Zhang, Y Yu, Y Li, Y Wang, Y Yang, M Yang… - arXiv preprint arXiv …, 2021 - arxiv.org
Recent Weak Supervision (WS) approaches have had widespread success in easing the
bottleneck of labeling training data for machine learning by synthesizing labels from multiple …

Robust and efficient medical imaging with self-supervision

S Azizi, L Culp, J Freyberg, B Mustafa, S Baur… - arXiv preprint arXiv …, 2022 - arxiv.org
Recent progress in Medical Artificial Intelligence (AI) has delivered systems that can reach
clinical expert level performance. However, such systems tend to demonstrate sub-optimal …

Data valuation for medical imaging using Shapley value and application to a large-scale chest X-ray dataset

S Tang, A Ghorbani, R Yamashita, S Rehman… - Scientific reports, 2021 - nature.com
The reliability of machine learning models can be compromised when trained on low quality
data. Many large-scale medical imaging datasets contain low quality labels extracted from …

Ontology-driven weak supervision for clinical entity classification in electronic health records

JA Fries, E Steinberg, S Khattar, SL Fleming… - Nature …, 2021 - nature.com
In the electronic health record, using clinical notes to identify entities such as disorders and
their temporality (e.g., the order of an event relative to a time index) can inform many important …