A review on visual privacy preservation techniques for active and assisted living
This paper reviews the state of the art in visual privacy protection techniques, with particular
attention paid to techniques applicable to the field of Active and Assisted Living (AAL). A …
Speednet: Learning the speediness in videos
We wish to automatically predict the "speediness" of moving objects in videos - whether they
move faster, at, or slower than their "natural" speed. The core component in our approach is …
Self-supervised video transformer
In this paper, we propose self-supervised training for video transformers using unlabeled
video data. From a given video, we create local and global spatiotemporal views with …
Spoken moments: Learning joint audio-visual representations from video descriptions
When people observe events, they are able to abstract key information and build concise
summaries of what is happening. These summaries include contextual and semantic …
How transferable are video representations based on synthetic data?
Action recognition has improved dramatically with massive-scale video datasets. Yet, these
datasets are accompanied with issues related to curation cost, privacy, ethics, bias, and …
Efficient video classification using fewer frames
S Bhardwaj, M Srinivasan… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Recently, there has been a lot of interest in building compact models for video classification
which have a small memory footprint (< 1 GB). While these models are compact, they …
Masking modalities for cross-modal video retrieval
Pre-training on large scale unlabelled datasets has shown impressive performance
improvements in the fields of computer vision and natural language processing. Given the …
Alignment-uniformity aware representation learning for zero-shot video classification
Most methods tackle zero-shot video classification by aligning visual-semantic
representations within seen classes, which limits generalization to unseen classes. To …
Cross-modal generalization: Learning in low resource modalities via meta-alignment
How can we generalize to a new prediction task at test time when it also uses a new
modality as input? More importantly, how can we do this with as little annotated data as …
Scalable and accurate self-supervised multimodal representation learning without aligned video and text data
Scaling up weakly-supervised datasets has shown to be highly effective in the image-text
domain and has contributed to most of the recent state-of-the-art computer vision and …