Deep reinforcement learning in computer vision: a comprehensive survey
Deep reinforcement learning augments the reinforcement learning framework and utilizes
the powerful representation of deep neural networks. Recent works have demonstrated the …
Applications of convolutional neural networks for intelligent waste identification and recycling: A review
With the implementations of “Zero Waste” and Industry 4.0, the rapidly increasing
applications of artificial intelligence in waste management have generated a large amount of …
MixFormer: End-to-end tracking with iterative mixed attention
Tracking often uses a multi-stage pipeline of feature extraction, target information
integration, and bounding box estimation. To simplify this pipeline and unify the process of …
SeqTrack: Sequence to sequence learning for visual object tracking
In this paper, we present a new sequence-to-sequence learning framework for visual
tracking, dubbed SeqTrack. It casts visual tracking as a sequence generation problem …
Joint feature learning and relation modeling for tracking: A one-stream framework
The current popular two-stream, two-stage tracking framework extracts the template and the
search region features separately and then performs relation modeling, thus the extracted …
Universal instance perception as object discovery and retrieval
All instance perception tasks aim at finding certain objects specified by some queries such
as category names, language expressions, and target annotations, but this complete field …
Visual prompt multi-modal tracking
Visible-modal object tracking gives rise to a series of downstream multi-modal tracking
tributaries. To inherit the powerful representations of the foundation model, a natural modus …
AiATrack: Attention in attention for transformer visual tracking
Transformer trackers have achieved impressive advancements recently, where the attention
mechanism plays an important role. However, the independent correlation computation in …
Transforming model prediction for tracking
Optimization based tracking methods have been widely successful by integrating a target
model prediction module, providing effective global reasoning by minimizing an objective …
Ego4D: Around the world in 3,000 hours of egocentric video
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It
offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household …