Unsupervised visual representation learning by tracking patches in video

P Bagad, M Tapaswi… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Modelling and understanding time remains a challenge in contemporary video
understanding models. With language emerging as a key driver towards powerful …

Cited by 30 Related articles All 9 versions

Video contrastive learning with global context

H Kuang, Y Zhu, Z Zhang, X Li… - Proceedings of the …, 2021 - openaccess.thecvf.com

Contrastive learning has revolutionized the self-supervised image representation learning
field and recently been adapted to the video domain. One of the greatest advantages of …

Cited by 70 Related articles All 10 versions

L-DAWA: Layer-wise divergence aware weight aggregation in federated self-supervised visual representation learning

YAU Rehman, Y Gao… - Proceedings of the …, 2023 - openaccess.thecvf.com

The ubiquity of camera-enabled devices has led to large amounts of unlabeled image data
being produced at the edge. The integration of self-supervised learning (SSL) and federated …

Cited by 12 Related articles All 7 versions

Masked motion encoding for self-supervised video representation learning

X Sun, P Chen, L Chen, C Li, TH Li… - Proceedings of the …, 2023 - openaccess.thecvf.com

How to learn discriminative video representation from unlabeled videos is challenging but
crucial for video analysis. The latest attempts seek to learn a representation model by …

Cited by 31 Related articles All 8 versions

A systematic literature review of visual feature learning: deep learning techniques, applications, challenges and future directions

M Abdullahi, ON Oyelade, AFD Kana… - Multimedia Tools and …, 2024 - Springer

Visual Feature Learning (VFL) is a critical area of research in computer vision that
involves the automatic extraction of features and patterns from images and videos. The …

Cited by 1 Related articles

TransRank: Self-supervised video representation learning via ranking-based transformation recognition

H Duan, N Zhao, K Chen, D Lin - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

Recognizing transformation types applied to a video clip (RecogTrans) is a long-established
paradigm for self-supervised video representation learning, which achieves much inferior …

Cited by 23 Related articles All 6 versions

Temporal action localization in the deep learning era: A survey

B Wang, Y Zhao, L Yang, T Long… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

The temporal action localization research aims to discover action instances from untrimmed
videos, representing a fundamental step in the field of intelligent video understanding. With …

Cited by 11 Related articles All 6 versions

Self-supervised video representation learning using improved instance-wise contrastive learning and deep clustering

Y Zhu, H Shuai, G Liu, Q Liu - IEEE Transactions on Circuits …, 2022 - ieeexplore.ieee.org

Instance-wise contrastive learning (Instance-CL), which learns to map similar instances
closer and different instances farther apart in the embedding space, has achieved …

Cited by 28 Related articles

Self-supervised spatiotemporal representation learning by exploiting video continuity

H Liang, N Quader, Z Chi, L Chen, P Dai, J Lu… - Proceedings of the …, 2022 - ojs.aaai.org

Recent self-supervised video representation learning methods have found significant
success by exploring essential properties of videos, e.g., speed, temporal order, etc. This work …

Cited by 26 Related articles All 8 versions

Pose-based contrastive learning for domain agnostic activity representations

D Schneider, S Sarfraz, A Roitberg… - Proceedings of the …, 2022 - openaccess.thecvf.com

While recognition accuracies of video classification models trained on conventional
benchmarks are gradually saturating, recent studies raise alarm about the learned …

Cited by 14 Related articles All 4 versions