Winner: Weakly-supervised hierarchical decomposition and alignment for spatio-temporal video...

W Zhang, Z Lv, H Zhou, JW Liu, J Li… - Proceedings of the …, 2024 - openaccess.thecvf.com

Active Domain Adaptation (ADA) aims to maximally boost model adaptation in a
new target domain by actively selecting a limited number of target data to annotate. This …

Cited by 11 Related articles All 3 versions

[PDF] thecvf.com

Are binary annotations sufficient? video moment retrieval via hierarchical uncertainty-based active learning

W Ji, R Liang, Z Zheng, W Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recent research on video moment retrieval has mostly focused on enhancing the
performance of accuracy, efficiency, and robustness, all of which largely rely on the …

Cited by 23 Related articles All 7 versions

[PDF] aaai.org

Panoptic scene graph generation with semantics-prototype learning

L Li, W Ji, Y Wu, M Li, Y Qin, L Wei… - Proceedings of the AAAI …, 2024 - ojs.aaai.org

Panoptic Scene Graph Generation (PSG) parses objects and predicts their relationships
(predicate) to connect human language and visual scenes. However, different language …

Cited by 18 Related articles All 3 versions

[PDF] researchgate.net

Intelligent model update strategy for sequential recommendation

Z Lv, W Zhang, Z Chen, S Zhang, K Kuang - Proceedings of the ACM on …, 2024 - dl.acm.org

Modern online platforms are increasingly employing recommendation systems to address
information overload and improve user engagement. There is an evolving paradigm in this …

Cited by 15 Related articles All 3 versions

[PDF] thecvf.com

Gradient-regulated meta-prompt learning for generalizable vision-language models

J Li, M Gao, L Wei, S Tang, W Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Prompt tuning, a recently emerging paradigm, enables the powerful vision-language
pre-training models to adapt to downstream tasks in a parameter- and data-efficient way, by …

Cited by 20 Related articles All 5 versions

[PDF] aclanthology.org

Multi-modal action chain abductive reasoning

M Li, T Wang, J Xu, K Han, S Zhang… - Proceedings of the …, 2023 - aclanthology.org

Abductive Reasoning, has long been considered to be at the core ability of humans, which
enables us to infer the most plausible explanation of incomplete known phenomena in daily …

Cited by 10 Related articles All 2 versions

[PDF] thecvf.com

Hig: Hierarchical interlacement graph approach to scene graph generation in video understanding

TT Nguyen, P Nguyen, K Luu - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Visual interactivity understanding within visual scenes presents a significant challenge in
computer vision. Existing methods focus on complex interactivities while leveraging a simple …

Cited by 6 Related articles All 6 versions

[PDF] thecvf.com

Learning in imperfect environment: Multi-label classification with long-tailed distribution and partial labels

W Zhang, C Liu, L Zeng, B Ooi… - Proceedings of the …, 2023 - openaccess.thecvf.com

Conventional multi-label classification (MLC) methods assume that all samples are fully
labeled and identically distributed. Unfortunately, this assumption is unrealistic in large …

Cited by 11 Related articles All 6 versions

Efficient long-short temporal attention network for unsupervised video object segmentation

P Li, Y Zhang, L Yuan, H Xiao, B Lin, X Xu - Pattern Recognition, 2024 - Elsevier

Unsupervised Video Object Segmentation (VOS) aims at identifying the contours of
primary foreground objects in videos without any prior knowledge. However, previous …

Cited by 12 Related articles All 3 versions

Unsupervised domain adaptation for video object grounding with cascaded debiasing learning

M Li, H Zhang, J Li, Z Zhao, W Zhang, S Zhang… - Proceedings of the 31st …, 2023 - dl.acm.org

This paper addresses the Unsupervised Domain Adaptation (UDA) for the dense frame
prediction task-Video Object Grounding (VOG). This investigation springs from the …

Cited by 6 Related articles