Unsupervised point cloud representation learning with deep neural networks: A survey

A Xiao, J Huang, D Guan, X Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Point cloud data have been widely explored due to its superior accuracy and robustness
under various adverse situations. Meanwhile, deep neural networks (DNNs) have achieved …

Towards global video scene segmentation with context-aware transformer

Y Yang, Y Huang, W Guo, B Xu, D Xia - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Videos such as movies or TV episodes usually need to divide the long storyline into
cohesive units, i.e., scenes, to facilitate the understanding of video semantics. The key …

Meltr: Meta loss transformer for learning to fine-tune video foundation models

D Ko, J Choi, HK Choi, KW On… - Proceedings of the …, 2023 - openaccess.thecvf.com
Foundation models have shown outstanding performance and generalization capabilities
across domains. Since most studies on foundation models mainly focus on the pretraining …

Frame-wise action representations for long videos via sequence contrastive learning

M Chen, F Wei, C Li, D Cai - Proceedings of the IEEE/CVF …, 2022 - openaccess.thecvf.com
Prior works on action representation learning mainly focus on designing various
architectures to extract the global representations for short video clips. In contrast, many …

Static and dynamic concepts for self-supervised video representation learning

R Qian, S Ding, X Liu, D Lin - European Conference on Computer Vision, 2022 - Springer
In this paper, we propose a novel learning scheme for self-supervised video representation
learning. Motivated by how humans understand videos, we propose to first learn general …

A electricity theft detection method through contrastive learning in smart grid

Z Liu, W Ding, T Chen, M Sun, H Cai, C Liu - EURASIP Journal on …, 2023 - Springer
As an important edge device of power grid, smart meters enable the detection of illegal
behaviors such as electricity theft by analyzing large-scale electricity consumption data …

Scene consistency representation learning for video scene segmentation

H Wu, K Chen, Y Luo, R Qiao, B Ren… - Proceedings of the …, 2022 - openaccess.thecvf.com
A long-term video, such as a movie or TV show, is composed of various scenes, each of
which represents a series of shots sharing the same semantic story. Spotting the correct …

Alignment-uniformity aware representation learning for zero-shot video classification

S Pu, K Zhao, M Zheng - … of the IEEE/CVF Conference on …, 2022 - openaccess.thecvf.com
Most methods tackle zero-shot video classification by aligning visual-semantic
representations within seen classes, which limits generalization to unseen classes. To …

Dual contrastive learning for spatio-temporal representation

S Ding, R Qian, H Xiong - Proceedings of the 30th ACM international …, 2022 - dl.acm.org
Contrastive learning has shown promising potential in self-supervised spatio-temporal
representation learning. Most works naively sample different clips to construct positive and …

Temporal augmented contrastive learning for micro-expression recognition

T Wang, L Shang - Pattern Recognition Letters, 2023 - Elsevier
Micro-expressions (MEs) can reveal the hidden but real emotion and are usually caused
spontaneously. However, the characteristics of subtlety and temporariness with the lack of …