Deep learning in computer vision: A critical review of emerging techniques and application scenarios
Deep learning has been overwhelmingly successful in computer vision (CV), natural
language processing, and video/speech recognition. In this paper, our focus is on CV. We …
DeepThink IoT: the strength of deep learning in internet of things
Abstract: The integration of Deep Learning (DL) and the Internet of Things (IoT) has
revolutionized technology in the twenty-first century, enabling humans and machines to …
Temporal collection and distribution for referring video object segmentation
Referring video object segmentation aims to segment a referent throughout a video
sequence according to a natural language expression. It requires aligning the natural …
Cross-modal progressive comprehension for referring segmentation
Given a natural language expression and an image/video, the goal of referring
segmentation is to produce the pixel-level masks of the entities described by the subject of …
Instance-specific feature propagation for referring segmentation
Referring segmentation aims to generate a segmentation mask for the target instance
indicated by a natural language expression. There are typically two kinds of existing …
Phraseclick: toward achieving flexible interactive segmentation by phrase and click
Existing interactive object segmentation methods mainly take spatial interactions such as
bounding boxes or clicks as input. However, these interactions do not contain information …
MFGNet: Dynamic modality-aware filter generation for RGB-T tracking
Many RGB-T trackers attempt to attain robust feature representation by utilizing an adaptive
weighting scheme (or attention mechanism). Different from these works, we propose a new …
An improved deep learning architecture for multi-object tracking systems
J Urdiales, D Martín… - Integrated Computer-Aided …, 2023 - content.iospress.com
Robust and reliable 3D multi-object tracking (MOT) is essential for autonomous driving in
crowded urban road scenes. In those scenarios, accurate data association between tracked …
Multiple relational learning network for joint referring expression comprehension and segmentation
Multi-task learning is a successful learning framework which improves the performance of
prediction models by leveraging knowledge among related tasks. Referring expression …
Panoptic narrative grounding
Abstract: This paper proposes Panoptic Narrative Grounding, a spatially fine and general
formulation of the natural language visual grounding problem. We establish an experimental …