Deep learning in computer vision: A critical review of emerging techniques and application scenarios

J Chai, H Zeng, A Li, EWT Ngai - Machine Learning with Applications, 2021 - Elsevier
Deep learning has been overwhelmingly successful in computer vision (CV), natural
language processing, and video/speech recognition. In this paper, our focus is on CV. We …

DeepThink IoT: the strength of deep learning in internet of things

D Thakur, JK Saini, S Srinivasan - Artificial Intelligence Review, 2023 - Springer
The integration of Deep Learning (DL) and the Internet of Things (IoT) has
revolutionized technology in the twenty-first century, enabling humans and machines to …

Temporal collection and distribution for referring video object segmentation

J Tang, G Zheng, S Yang - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Referring video object segmentation aims to segment a referent throughout a video
sequence according to a natural language expression. It requires aligning the natural …

Cross-modal progressive comprehension for referring segmentation

S Liu, T Hui, S Huang, Y Wei, B Li… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Given a natural language expression and an image/video, the goal of referring
segmentation is to produce the pixel-level masks of the entities described by the subject of …

Instance-specific feature propagation for referring segmentation

C Liu, X Jiang, H Ding - IEEE Transactions on Multimedia, 2022 - ieeexplore.ieee.org
Referring segmentation aims to generate a segmentation mask for the target instance
indicated by a natural language expression. There are typically two kinds of existing …

PhraseClick: toward achieving flexible interactive segmentation by phrase and click

H Ding, S Cohen, B Price, X Jiang - … , Glasgow, UK, August 23–28, 2020 …, 2020 - Springer
Existing interactive object segmentation methods mainly take spatial interactions such as
bounding boxes or clicks as input. However, these interactions do not contain information …

MFGNet: Dynamic modality-aware filter generation for RGB-T tracking

X Wang, X Shu, S Zhang, B Jiang… - IEEE Transactions …, 2022 - ieeexplore.ieee.org
Many RGB-T trackers attempt to attain robust feature representation by utilizing an adaptive
weighting scheme (or attention mechanism). Different from these works, we propose a new …

An improved deep learning architecture for multi-object tracking systems

J Urdiales, D Martín… - Integrated Computer-Aided …, 2023 - content.iospress.com
Robust and reliable 3D multi-object tracking (MOT) is essential for autonomous driving in
crowded urban road scenes. In those scenarios, accurate data association between tracked …

Multiple relational learning network for joint referring expression comprehension and segmentation

G Hua, M Liao, S Tian, Y Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Multi-task learning is a successful learning framework which improves the performance of
prediction models by leveraging knowledge among related tasks. Referring expression …

Panoptic narrative grounding

C González, N Ayobi, I Hernández… - Proceedings of the …, 2021 - openaccess.thecvf.com
This paper proposes Panoptic Narrative Grounding, a spatially fine and general
formulation of the natural language visual grounding problem. We establish an experimental …