Vision transformers for action recognition: A survey
Vision transformers are emerging as a powerful tool to solve computer vision problems.
Recent techniques have also proven the efficacy of transformers beyond the image domain …
Recent techniques have also proven the efficacy of transformers beyond the image domain …
Ms-tct: Multi-scale temporal convtransformer for action detection
Action detection is an essential and challenging task, especially for densely labelled
datasets of untrimmed videos. The temporal relation is complex in those datasets, including …
datasets of untrimmed videos. The temporal relation is complex in those datasets, including …
Unimd: Towards unifying moment retrieval and temporal action detection
Abstract Temporal Action Detection (TAD) focuses on detecting pre-defined actions, while
Moment Retrieval (MR) aims to identify the events described by open-ended natural …
Moment Retrieval (MR) aims to identify the events described by open-ended natural …
Token turing machines
MS Ryoo, K Gopalakrishnan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Abstract We propose Token Turing Machines (TTM), a sequential, autoregressive
Transformer model with memory for real-world sequential visual understanding. Our model …
Transformer model with memory for real-world sequential visual understanding. Our model …
Does self-supervised learning really improve reinforcement learning from pixels?
We investigate whether self-supervised learning (SSL) can improve online reinforcement
learning (RL) from pixels. We extend the contrastive reinforcement learning framework (eg …
learning (RL) from pixels. We extend the contrastive reinforcement learning framework (eg …
Two-dimensional and three-dimensional CNN-based simultaneous detection and activity classification of construction workers
The type and duration of construction workers' activities are useful information for project
management purposes. Therefore, several studies have used surveillance cameras and …
management purposes. Therefore, several studies have used surveillance cameras and …
Adafocus: Towards end-to-end weakly supervised learning for long-video action understanding
Developing end-to-end models for long-video action understanding tasks presents
significant computational and memory challenges. Existing works generally build models on …
significant computational and memory challenges. Existing works generally build models on …
Aan: Attributes-aware network for temporal action detection
The challenge of long-term video understanding remains constrained by the efficient
extraction of object semantics and the modelling of their relationships for downstream tasks …
extraction of object semantics and the modelling of their relationships for downstream tasks …
Test-Time Mixup Augmentation for Data and Class-Dependent Uncertainty Estimation in Deep Learning Image Classification
Uncertainty estimation of the trained deep learning networks is valuable for optimizing
learning efficiency and evaluating the reliability of network predictions. In this paper, we …
learning efficiency and evaluating the reliability of network predictions. In this paper, we …
Productivity Monitoring of Construction Workers Based on Spatiotemporal Activity Recognition
G Torabi - 2022 - spectrum.library.concordia.ca
Workers' productivity monitoring is an essential but time-consuming part of large
construction projects. Therefore, automating this process using surveillance cameras has …
construction projects. Therefore, automating this process using surveillance cameras has …