Unified transformer tracker for object tracking

X Chen, H Peng, D Wang, H Lu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

In this paper, we present a new sequence-to-sequence learning framework for visual
tracking, dubbed SeqTrack. It casts visual tracking as a sequence generation problem …

Cited by 141 · Related articles · All 5 versions

Mixformer: End-to-end tracking with iterative mixed attention

Y Cui, C Jiang, L Wang, G Wu - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

Tracking often uses a multi-stage pipeline of feature extraction, target information
integration, and bounding box estimation. To simplify this pipeline and unify the process of …

Cited by 493 · Related articles · All 11 versions

Integrating boxes and masks: A multi-object framework for unified visual tracking and segmentation

Y Xu, Z Yang, Y Yang - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com

Tracking any given object(s) spatially and temporally is a common purpose in Visual Object
Tracking (VOT) and Video Object Segmentation (VOS). Joint tracking and segmentation …

Cited by 7 · Related articles · All 6 versions

Onetracker: Unifying visual object tracking with foundation models and efficient tuning

L Hong, S Yan, R Zhang, W Li, X Zhou… - Proceedings of the …, 2024 - openaccess.thecvf.com

Visual object tracking aims to localize the target object of each frame based on its initial
appearance in the first frame. Depending on the input modality, tracking tasks can be divided …

Cited by 16 · Related articles · All 3 versions

DiffusionTrack: Point Set Diffusion Model for Visual Object Tracking

F Xie, Z Wang, C Ma - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com

Existing Siamese or transformer trackers commonly pose visual object tracking as a
one-shot detection problem, i.e., locating the target object in a single forward evaluation scheme …

Cited by 7 · Related articles

[PDF] Convolution-Transformer for Image Feature Extraction

L Yin, L Wang, S Lu, R Wang, Y Yang… - … in Engineering & …, 2024 - cdn.techscience.cn

This study addresses the limitations of Transformer models in image feature extraction,
particularly their lack of inductive bias for visual structures. Compared to Convolutional …

Cited by 12 · Related articles · All 2 versions

Transformers in single object tracking: an experimental survey

J Kugarajeevan, T Kokul, A Ramanan… - IEEE Access, 2023 - ieeexplore.ieee.org

Single-object tracking is a well-known and challenging research topic in computer vision.
Over the last two decades, numerous researchers have proposed various algorithms to …

Cited by 38 · Related articles · All 5 versions

Intelligent video analytics for human action recognition: the state of knowledge

M Kulbacki, J Segen, Z Chaczko, JW Rozenblit… - Sensors, 2023 - mdpi.com

The paper presents a comprehensive overview of intelligent video analytics and human
action recognition methods. The article provides an overview of the current state of …

Cited by 6 · Related articles · All 13 versions

Single-stage visual query localization in egocentric videos

H Jiang, SK Ramakrishnan… - Advances in Neural …, 2024 - proceedings.neurips.cc

Visual Query Localization on long-form egocentric videos requires spatio-temporal
search and localization of visually specified objects and is vital to build episodic memory …

Cited by 6 · Related articles · All 6 versions

Correlational image modeling for self-supervised visual pre-training

W Li, J Xie, CC Loy - … of the IEEE/CVF conference on …, 2023 - openaccess.thecvf.com

We introduce Correlational Image Modeling (CIM), a novel but surprisingly effective
approach to self-supervised visual pre-training. Our CIM performs a simple pretext task: we …

Cited by 7 · Related articles · All 8 versions