Vision-language models for medical report generation and visual question answering: A review

I Hartsock, G Rasool - Frontiers in Artificial Intelligence, 2024 - frontiersin.org
Medical vision-language models (VLMs) combine computer vision (CV) and natural
language processing (NLP) to analyze visual and textual medical data. Our paper reviews …

Vision transformer architecture and applications in digital health: a tutorial and survey

K Al-Hammuri, F Gebali, A Kanan… - Visual computing for …, 2023 - Springer
The vision transformer (ViT) is a state-of-the-art architecture for image recognition tasks that
plays an important role in digital health applications. Medical images account for 90% of the …

Rendezvous: Attention mechanisms for the recognition of surgical action triplets in endoscopic videos

CI Nwoye, T Yu, C Gonzalez, B Seeliger… - Medical Image …, 2022 - Elsevier
Out of all existing frameworks for surgical workflow analysis in endoscopic videos, action
triplet recognition stands out as the only one aiming to provide truly fine-grained and …

Myocardial involvement after hospitalization for COVID-19 complicated by troponin elevation: a prospective, multicenter, observational study

J Artico, H Shiwani, JC Moon, M Gorecka, GP McCann… - Circulation, 2023 - Am Heart Assoc
Background: Acute myocardial injury in hospitalized patients with coronavirus disease 2019
(COVID-19) has a poor prognosis. Its associations and pathogenesis are unclear. Our aim …

CholecTriplet2021: A benchmark challenge for surgical action triplet recognition

CI Nwoye, D Alapatt, T Yu, A Vardazaryan, F Xia… - Medical Image …, 2023 - Elsevier
Context-aware decision support in the operating room can foster surgical safety and
efficiency by leveraging real-time feedback from surgical workflow analysis. Most existing …

CholecTriplet2022: Show me a tool and tell me the triplet—An endoscopic vision challenge for surgical action triplet detection

CI Nwoye, T Yu, S Sharma, A Murali, D Alapatt… - Medical Image …, 2023 - Elsevier
Formalizing surgical activities as triplets of the used instruments, actions performed, and
target anatomies is becoming a gold standard approach for surgical activity modeling. The …

Video-instrument synergistic network for referring video instrument segmentation in robotic surgery

H Wang, G Yang, S Zhang, J Qin, Y Guo… - … on Medical Imaging, 2024 - ieeexplore.ieee.org
Surgical instrument segmentation is fundamentally important for facilitating cognitive
intelligence in robot-assisted surgery. Although existing methods have achieved accurate …

Polybot: Training one policy across robots while embracing variability

J Yang, D Sadigh, C Finn - arXiv preprint arXiv:2307.03719, 2023 - arxiv.org
Reusing large datasets is crucial to scale vision-based robotic manipulators to everyday
scenarios due to the high cost of collecting robotic datasets. However, robotic platforms …

Class-incremental domain adaptation with smoothing and calibration for surgical report generation

M Xu, M Islam, CM Lim, H Ren - … , France, September 27–October 1, 2021 …, 2021 - Springer
Generating surgical reports aimed at surgical scene understanding in robot-assisted surgery
can contribute to documenting entry tasks and post-operative analysis. Despite the …

Curriculum-based augmented Fourier domain adaptation for robust medical image segmentation

A Wang, M Islam, M Xu, H Ren - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Accurate and robust medical image segmentation is fundamental and crucial for enhancing
the autonomy of computer-aided diagnosis and intervention systems. Medical data …