Privacy preserving visual question answering

CP Bara, Q Ping, A Mathur, G Thattai, R MV, GS Sukhatme

arXiv preprint arXiv:2202.07712, 2022 • arxiv.org

We introduce a novel privacy-preserving methodology for performing Visual Question Answering on the edge. Our method constructs a symbolic representation of the visual scene, using a low-complexity computer vision model that jointly predicts classes, attributes and predicates. This symbolic representation is non-differentiable, which means it cannot be used to recover the original image, thereby keeping the original image private. Our proposed hybrid solution uses a vision model which is more than 25 times smaller than the current state-of-the-art (SOTA) vision models, and 100 times smaller than end-to-end SOTA VQA models. We report detailed error analysis and discuss the trade-offs of using a distilled vision model and a symbolic representation of the visual scene.
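
To make the idea of a symbolic scene representation more concrete, the following is a minimal sketch of how predicted classes, attributes and predicates might be stored and queried. It is not the authors' implementation: the names `SceneObject`, `SceneGraph`, `attributes_of`, `related` and `build_example_scene`, as well as the example scene itself, are assumptions made purely for illustration, and the vision model that would produce these predictions is not shown.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SceneObject:
    """One detected object: a class label plus a set of attribute labels."""
    obj_id: int
    cls: str
    attributes: frozenset = frozenset()


@dataclass
class SceneGraph:
    """Hypothetical symbolic scene: objects plus (subject, relation, object) triples."""
    objects: dict = field(default_factory=dict)      # obj_id -> SceneObject
    predicates: list = field(default_factory=list)   # (subj_id, relation, obj_id)

    def add_object(self, obj_id, cls, attributes=()):
        self.objects[obj_id] = SceneObject(obj_id, cls, frozenset(attributes))

    def add_predicate(self, subj_id, relation, obj_id):
        self.predicates.append((subj_id, relation, obj_id))

    def attributes_of(self, cls):
        """Sorted attributes of every object with the given class label."""
        found = set()
        for obj in self.objects.values():
            if obj.cls == cls:
                found |= obj.attributes
        return sorted(found)

    def related(self, subj_cls, relation):
        """Sorted class labels of objects that `subj_cls` stands in `relation` to."""
        return sorted({
            self.objects[o].cls
            for s, r, o in self.predicates
            if r == relation and self.objects[s].cls == subj_cls
        })


def build_example_scene():
    # Made-up predictions standing in for the output of a vision model.
    graph = SceneGraph()
    graph.add_object(0, "dog", {"brown", "small"})
    graph.add_object(1, "sofa", {"red"})
    graph.add_predicate(0, "on", 1)
    return graph


scene = build_example_scene()
print(scene.attributes_of("dog"))   # "What does the dog look like?"
print(scene.related("dog", "on"))   # "What is the dog on?"
```

In a setup like this, a question would be answered from the labels and triples alone, so the question-answering step never needs access to pixel data.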

Cited by: 3 · Related articles · All 7 versions