Egocentric video task translation

Z Xue, Y Song, K Grauman… - Proceedings of the …, 2023 - openaccess.thecvf.com
Different video understanding tasks are typically treated in isolation, and even with distinct
types of curated data (e.g., classifying sports in one dataset, tracking animals in another) …

Egocentric video task translation @ Ego4D challenge 2022

Z Xue, Y Song, K Grauman, L Torresani - arXiv preprint arXiv:2302.01891, 2023 - arxiv.org
This technical report describes the EgoTask Translation approach that explores relations
among a set of egocentric video tasks in the Ego4D challenge. To improve the primary task …

Efficient video representation learning via motion-aware token selection

S Hwang, J Yoon, Y Lee, SJ Hwang - arXiv preprint arXiv:2211.10636, 2022 - arxiv.org
Recently emerged Masked Video Modeling techniques demonstrated their potential by
significantly outperforming previous methods in self-supervised learning for video. However …

Masked Autoencoders for Egocentric Video Understanding @ Ego4D Challenge 2022

J Lei, S Ma, Z Ba, S Vemprala, A Kapoor… - arXiv preprint arXiv …, 2022 - arxiv.org
In this report, we present our approach and empirical results of applying masked
autoencoders in two egocentric video understanding tasks, namely, Object State Change …

EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens

S Hwang, J Yoon, Y Lee, SJ Hwang - openreview.net
Masked video autoencoder approaches have demonstrated their potential by significantly
outperforming previous self-supervised learning methods in video representation learning …