Fine-grained visual–text prompt-driven self-training for open-vocabulary object detection

J Zhang, J Huang, S Jin, S Lu - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org

Most visual recognition studies rely heavily on crowd-labelled data in deep neural networks
(DNNs) training, and they usually train a DNN for each single visual recognition task …

Cited by 375 · Related articles · All 9 versions

A survey on open-vocabulary detection and segmentation: Past, present, and future

C Zhu, L Chen - IEEE Transactions on Pattern Analysis and …, 2024 - ieeexplore.ieee.org

As the most fundamental scene understanding tasks, object detection and segmentation
have made tremendous progress in deep learning era. Due to the expensive manual …

Cited by 21 · Related articles · All 7 versions

An incremental-self-training-guided semi-supervised broad learning system

J Guo, Z Liu, CLP Chen - IEEE Transactions on Neural …, 2024 - ieeexplore.ieee.org

The broad learning system (BLS) has recently been applied in numerous fields. However, it
is mainly a supervised learning system and thus not suitable for specific practical …

Cited by 6 · Related articles

Text-prompt Camouflaged Instance Segmentation with Graduated Camouflage Learning

Z He, C Xia, S Qiao, J Li - Proceedings of the 32nd ACM International …, 2024 - dl.acm.org

Camouflaged instance segmentation (CIS) aims to detect and segment objects blending
with their surroundings. While existing CIS methods rely heavily on fully-supervised training …

Cited by 1 · Related articles · All 2 versions

Non-exemplar Domain Incremental Learning via Cross-Domain Concept Integration

Q Wang, Y He, S Dong, X Gao, S Wang… - European Conference on …, 2025 - Springer

Existing approaches to Domain Incremental Learning (DIL) address catastrophic
forgetting by storing and rehearsing exemplars from old domains. However, exemplar-based …

Proactive schemes: A survey of adversarial attacks for social good

V Asnani, X Yin, X Liu - arXiv preprint arXiv:2409.16491, 2024 - arxiv.org

Adversarial attacks in computer vision exploit the vulnerabilities of machine learning models
by introducing subtle perturbations to input data, often leading to incorrect predictions or …

Cited by 1 · Related articles · All 3 versions

Semantically Enhanced Scene Captions with Physical and Weather Condition Changes

H Sakaino - Proceedings of the IEEE/CVF International …, 2023 - openaccess.thecvf.com

Vision-Language models (VLMs), i.e., image-text pairs of CLIP, have boosted image-based
Deep Learning (DL). Moreover, Visual-Question-Answer (VQA) tools and open …

Cited by 1 · Related articles · All 3 versions

PV-Cap: 3D Dynamic Scene Understanding Through Open Physics-based Vocabulary

H Sakaino, TN Phuong, VN Duy - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Recently large Vision Language (VL) models, i.e., CLIP, have demonstrated
impressive capabilities in training solely on internet-scale image-language pairs. Moreover …

Dynamic Texts From UAV Perspective Natural Images

H Sakaino - Proceedings of the IEEE/CVF International …, 2023 - openaccess.thecvf.com

Drone-based image processing offers valuable capabilities for surveillance, detection, and
tracking in vast areas, aiding in disaster search and rescue and monitoring artificial events …

Cited by 2 · Related articles · All 4 versions

Advancing Causal Intervention in Image Captioning With Causal Prompt

Y Yu, Y Kim, YM Ro - IEEE Transactions on Neural Networks …, 2024 - ieeexplore.ieee.org

This article introduces a novel approach, called causal prompting network (CPNet), to
enhance the causal intervention in the context of image captioning. By leveraging visual …