Multimodal prototype-enhanced network for few-shot action recognition

J Xing, M Wang, Y Ruan, B Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com

Class prototype construction and matching are core aspects of few-shot action recognition.
Previous methods mainly focus on designing spatiotemporal relation modeling modules or …

Cited by 16 - Related articles - All 5 versions

SOC: Semantic-assisted object cluster for referring video object segmentation

Z Luo, Y Xiao, Y Liu, S Li, Y Wang… - Advances in …, 2024 - proceedings.neurips.cc

This paper studies referring video object segmentation (RVOS) by boosting video-level
visual-linguistic alignment. Recent approaches model the RVOS task as a sequence …

Cited by 23 - Related articles - All 5 versions

A Comprehensive Review of Few-shot Action Recognition

Y Wanyan, X Yang, W Dong, C Xu - arXiv preprint arXiv:2407.14744, 2024 - arxiv.org

Few-shot action recognition aims to address the high cost and impracticality of manually
labeling complex and variable video data in action recognition. It requires accurately …

Cited by 1 - Related articles - All 3 versions

Consistency Prototype Module and Motion Compensation for few-shot action recognition (CLIP-CPM2C)

F Guo, YK Wang, H Qi, L Zhu, J Sun - Neurocomputing, 2025 - Elsevier

Recently, few-shot action recognition has progressed significantly, as it has learned the
feature discriminability and designed suitable comparison methods. Still, there are the …

Cited by 2 - Related articles

Semantic-aware Video Representation for Few-shot Action Recognition

Y Tang, B Béjar, R Vidal - Proceedings of the IEEE/CVF …, 2024 - openaccess.thecvf.com

Recent work on action recognition leverages 3D features and textual information to achieve
state-of-the-art performance. However, most of the current few-shot action recognition …

Cited by 3 - Related articles - All 5 versions

Trajectory-aligned Space-time Tokens for Few-shot Action Recognition

P Kumar, N Padmanabhan, L Luo… - … on Computer Vision, 2025 - Springer

We propose a simple yet effective approach for few-shot action recognition, emphasizing the
disentanglement of motion and appearance representations. By harnessing recent progress …

Multi-view distillation based on multi-modal fusion for few-shot action recognition (CLIP-MDMF)

F Guo, YK Wang, H Qi, W Jin, L Zhu, J Sun - Knowledge-Based Systems, 2024 - Elsevier

In recent years, the field of few-shot action recognition (FSAR) has garnered significant
attention. Although many methods primarily rely on mono-modal data, there is a growing …

Fully Aligned Network for Referring Image Segmentation

Y Liu, R Xu, Y Tang - arXiv preprint arXiv:2409.19569, 2024 - arxiv.org

This paper focuses on the Referring Image Segmentation (RIS) task, which aims to segment
objects from an image based on a given language description. The critical problem of RIS is …

They Look Like Each Other: Case-based Reasoning for Explainable Depression Detection on Twitter using Large Language Models

MS Mahdavinejad, P Adibi, A Monadjemi… - arXiv preprint arXiv …, 2024 - arxiv.org

Depression is a common mental health issue that requires prompt diagnosis and treatment.
Despite the promise of social media data for depression detection, the opacity of employed …

Few-Shot Relation Extraction with Hybrid Visual Evidence

J Gong, H Eldardiry - arXiv preprint arXiv:2403.00724, 2024 - arxiv.org

The goal of few-shot relation extraction is to predict relations between named entities in a
sentence when only a few labeled instances are available for training. Existing few-shot …

Cited by 1 - Related articles - All 3 versions