Authors
Kun Yuan, Manasi Kattel, Joël L Lavanchy, Nassir Navab, Vinkle Srivastav, Nicolas Padoy
Publication date
2024/5/23
Journal
International Journal of Computer Assisted Radiology and Surgery
Pages
1-9
Publisher
Springer International Publishing
Description
Purpose
The modern operating room is becoming increasingly complex, requiring innovative intra-operative support systems. While the focus of surgical data science has largely been on video analysis, integrating surgical computer vision with natural language capabilities is emerging as a necessity. Our work aims to advance visual question answering (VQA) in the surgical context with scene graph knowledge, addressing two main challenges in the current surgical VQA systems: removing question–condition bias in the surgical VQA dataset and incorporating scene-aware reasoning in the surgical VQA model design.
Methods
First, we propose a surgical scene graph-based dataset, SSG-VQA, generated by employing segmentation and detection models on publicly available datasets. We build surgical scene graphs using spatial and action information of instruments and anatomies. These graphs are fed into a …