Deconfounded visual grounding

J Huang, Y Qin, J Qi, Q Sun, H Zhang - Proceedings of the AAAI …, 2022 - ojs.aaai.org
We focus on the confounding bias between language and location in the visual grounding
pipeline, where we find that the bias is the major visual reasoning bottleneck. For example …

Cross-modality synergy network for referring expression comprehension and segmentation

Q Li, Y Zhang, S Sun, J Wu, X Zhao, M Tan - Neurocomputing, 2022 - Elsevier
Referring expression comprehension and segmentation aim to locate and segment a
referred instance in an image according to a natural language expression. However …

Bridging the gap between object detection and user intent via query-modulation

M Fornoni, C Yan, L Luo, K Wilber, A Stark… - arXiv preprint arXiv …, 2021 - arxiv.org
When interacting with objects through cameras, or pictures, users often have a specific
intent. For example, they may want to perform a visual search. With most object detection …

Platform-specific model compression for deep neural networks with joint methods

S Lin - 2020 - search.proquest.com
Deep learning has delivered its powerfulness in many application domains, especially in
computer vision, natural language processing and speech recognition. As the backbone of …