A review of convolutional neural network architectures and their optimizations
The research advances concerning the typical architectures of convolutional neural
networks (CNNs) as well as their optimizations are analyzed and elaborated in detail in this …
networks (CNNs) as well as their optimizations are analyzed and elaborated in detail in this …
Moviechat: From dense token to sparse memory for long video understanding
Recently integrating video foundation models and large language models to build a video
understanding system can overcome the limitations of specific pre-defined vision tasks. Yet …
understanding system can overcome the limitations of specific pre-defined vision tasks. Yet …
Xmem: Long-term video object segmentation with an atkinson-shiffrin memory model
HK Cheng, AG Schwing - European Conference on Computer Vision, 2022 - Springer
We present XMem, a video object segmentation architecture for long videos with unified
feature memory stores inspired by the Atkinson-Shiffrin memory model. Prior work on video …
feature memory stores inspired by the Atkinson-Shiffrin memory model. Prior work on video …
Aiatrack: Attention in attention for transformer visual tracking
Transformer trackers have achieved impressive advancements recently, where the attention
mechanism plays an important role. However, the independent correlation computation in …
mechanism plays an important role. However, the independent correlation computation in …
Putting the object back into video object segmentation
We present Cutie a video object segmentation (VOS) network with object-level memory
reading which puts the object representation from memory back into the video object …
reading which puts the object representation from memory back into the video object …
Boosting video object segmentation via space-time correspondence learning
Current top-leading solutions for video object segmentation (VOS) typically follow a
matching-based regime: for each query frame, the segmentation mask is inferred according …
matching-based regime: for each query frame, the segmentation mask is inferred according …
Hierarchical feature alignment network for unsupervised video object segmentation
Optical flow is an easily conceived and precious cue for advancing unsupervised video
object segmentation (UVOS). Most of the previous methods directly extract and fuse the …
object segmentation (UVOS). Most of the previous methods directly extract and fuse the …
Recurrent dynamic embedding for video object segmentation
Abstract Space-time memory (STM) based video object segmentation (VOS) networks
usually keep increasing memory bank every several frames, which shows excellent …
usually keep increasing memory bank every several frames, which shows excellent …
Per-clip video object segmentation
Recently, memory-based approaches show promising results on semi-supervised video
object segmentation. These methods predict object masks frame-by-frame with the help of …
object segmentation. These methods predict object masks frame-by-frame with the help of …
Scalable video object segmentation with simplified framework
The current popular methods for video object segmentation (VOS) implement feature
matching through several hand-crafted modules that separately perform feature extraction …
matching through several hand-crafted modules that separately perform feature extraction …