Deep learning and transfer learning for device-free human activity recognition: A survey
Device-free activity recognition plays a crucial role in smart building, security, and human–
computer interaction, which shows its strength in its convenience and cost-efficiency …
Deep learning for multiple object tracking: a survey
Deep learning has been proved effective in multiple object tracking, which confronts the
difficulties of frequent occlusions, confusing appearance, in‐and‐out objects, and lack of …
Targeted supervised contrastive learning for long-tailed recognition
Real-world data often exhibits long tail distributions with heavy class imbalance, where the
majority classes can dominate the training process and alter the decision boundaries of the …
RAFT: Recurrent all-pairs field transforms for optical flow
Z Teed, J Deng - Computer Vision–ECCV 2020: 16th European …, 2020 - Springer
We introduce Recurrent All-Pairs Field Transforms (RAFT), a new deep network
architecture for optical flow. RAFT extracts per-pixel features, builds multi-scale 4D …
X3D: Expanding architectures for efficient video recognition
C Feichtenhofer - Proceedings of the IEEE/CVF conference …, 2020 - openaccess.thecvf.com
This paper presents X3D, a family of efficient video networks that progressively expand a
tiny 2D image classification architecture along multiple network axes, in space, time, width …
Graph convolutional networks for temporal action localization
Most state-of-the-art action localization systems process each action proposal individually,
without explicitly exploiting their relations during learning. However, the relations between …
Deep multimodal fusion by channel exchanging
Deep multimodal fusion by using multiple sources of data for classification or regression has
exhibited a clear advantage over the unimodal counterpart on various applications. Yet …
STM: Spatiotemporal and motion encoding for action recognition
Spatiotemporal and motion features are two complementary and crucial information for
video action recognition. Recent state-of-the-art methods adopt a 3D CNN stream to learn …
A comprehensive study of deep video action recognition
Video action recognition is one of the representative tasks for video understanding. Over the
last decade, we have witnessed great advancements in video action recognition thanks to …
Quality assessment of in-the-wild videos
Quality assessment of in-the-wild videos is a challenging problem because of the absence
of reference videos and shooting distortions. Knowledge of the human visual system can …