A review on visual privacy preservation techniques for active and assisted living

S Ravi, P Climent-Pérez, F Florez-Revuelta - Multimedia Tools and …, 2024 - Springer
This paper reviews the state of the art in visual privacy protection techniques, with particular
attention paid to techniques applicable to the field of Active and Assisted Living (AAL). A …

Speednet: Learning the speediness in videos

S Benaim, A Ephrat, O Lang, I Mosseri… - Proceedings of the …, 2020 - openaccess.thecvf.com
We wish to automatically predict the "speediness" of moving objects in videos, whether they
move faster, at, or slower than their "natural" speed. The core component in our approach is …

Self-supervised video transformer

K Ranasinghe, M Naseer, S Khan… - Proceedings of the …, 2022 - openaccess.thecvf.com
In this paper, we propose self-supervised training for video transformers using unlabeled
video data. From a given video, we create local and global spatiotemporal views with …

Spoken moments: Learning joint audio-visual representations from video descriptions

M Monfort, SY Jin, A Liu, D Harwath… - Proceedings of the …, 2021 - openaccess.thecvf.com
When people observe events, they are able to abstract key information and build concise
summaries of what is happening. These summaries include contextual and semantic …

How transferable are video representations based on synthetic data?

Y Kim, S Mishra, SY Jin, R Panda… - Advances in …, 2022 - proceedings.neurips.cc
Action recognition has improved dramatically with massive-scale video datasets. Yet, these
datasets are accompanied with issues related to curation cost, privacy, ethics, bias, and …

Efficient video classification using fewer frames

S Bhardwaj, M Srinivasan… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Recently, there has been a lot of interest in building compact models for video classification
which have a small memory footprint (< 1 GB). While these models are compact, they …

Masking modalities for cross-modal video retrieval

V Gabeur, A Nagrani, C Sun… - Proceedings of the …, 2022 - openaccess.thecvf.com
Pre-training on large scale unlabelled datasets has shown impressive performance
improvements in the fields of computer vision and natural language processing. Given the …

Alignment-uniformity aware representation learning for zero-shot video classification

S Pu, K Zhao, M Zheng - … of the IEEE/CVF Conference on …, 2022 - openaccess.thecvf.com
Most methods tackle zero-shot video classification by aligning visual-semantic
representations within seen classes, which limits generalization to unseen classes. To …

Cross-modal generalization: Learning in low resource modalities via meta-alignment

PP Liang, P Wu, L Ziyin, LP Morency… - Proceedings of the 29th …, 2021 - dl.acm.org
How can we generalize to a new prediction task at test time when it also uses a new
modality as input? More importantly, how can we do this with as little annotated data as …

Scalable and accurate self-supervised multimodal representation learning without aligned video and text data

V Lialin, S Rawls, D Chan, S Ghosh… - Proceedings of the …, 2023 - openaccess.thecvf.com
Scaling up weakly-supervised datasets has shown to be highly effective in the image-text
domain and has contributed to most of the recent state-of-the-art computer vision and …