- Academic resource search

A comprehensive survey of deep learning for image captioning

MDZ Hossain, F Sohel, MF Shiratuddin… - ACM Computing Surveys …, 2019 - dl.acm.org

Generating a description of an image is called image captioning. Image captioning requires
recognizing the important objects, their attributes, and their relationships in an image. It also …

Cited by 984 - Related articles - All 8 versions

[HTML] nih.gov

[HTML] Human gaze assisted artificial intelligence: A review

R Zhang, A Saran, B Liu, Y Zhu, S Guo… - IJCAI: Proceedings of …, 2020 - ncbi.nlm.nih.gov

Human gaze reveals a wealth of information about internal cognitive state. Thus, gaze-
related research has significantly increased in computer vision, natural language …

Cited by 76 - Related articles - All 7 versions

[PDF] neurips.cc

Reco: Retrieve and co-segment for zero-shot transfer

G Shin, W Xie, S Albanie - Advances in Neural Information …, 2022 - proceedings.neurips.cc

Semantic segmentation has a broad range of applications, but its real-world impact has
been significantly limited by the prohibitive annotation costs necessary to enable …

Cited by 94 - Related articles - All 5 versions

[PDF] thecvf.com

Latent embeddings for zero-shot classification

Y Xian, Z Akata, G Sharma, Q Nguyen… - Proceedings of the …, 2016 - openaccess.thecvf.com

We present a novel latent embedding model for learning a compatibility function between
image and class embeddings, in the context of zero-shot classification. The proposed …

Cited by 871 - Related articles - All 13 versions

Class attention network for image recognition

G Cheng, P Lai, D Gao, J Han - Science China Information Sciences, 2023 - Springer

Visual attention has become a popular and widely used component for image recognition.
Although various attention-based methods have been proposed and achieved relatively …

Cited by 76 - Related articles - All 3 versions

[PDF] arxiv.org

What's the point: Semantic segmentation with point supervision

A Bearman, O Russakovsky, V Ferrari… - European conference on …, 2016 - Springer

The semantic image segmentation task presents a trade-off between test time accuracy and
training time annotation cost. Detailed per-pixel annotations enable training accurate …

Cited by 1221 - Related articles - All 7 versions

[PDF] thecvf.com

Evaluating weakly supervised object localization methods right

J Choe, SJ Oh, S Lee, S Chun… - Proceedings of the …, 2020 - openaccess.thecvf.com

Weakly-supervised object localization (WSOL) has gained popularity over the last years for
its promise to train localization models with only image-level labels. Since the seminal …

Cited by 225 - Related articles - All 12 versions

[PDF] cv-foundation.org

Salicon: Saliency in context

M Jiang, S Huang, J Duan, Q Zhao - Proceedings of the IEEE …, 2015 - cv-foundation.org

Saliency in Context (SALICON) is an ongoing effort that aims at understanding and
predicting visual attention. This paper presents a new method to collect large-scale human …

Cited by 879 - Related articles - All 12 versions

[PDF] thecvf.com

Visual attention consistency under image transforms for multi-label image classification

H Guo, K Zheng, X Fan, H Yu… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com

Human visual perception shows good consistency for many multi-label image classification
tasks under certain spatial transforms, such as scaling, rotation, flipping and translation. This …

Cited by 284 - Related articles - All 10 versions

[PDF] thecvf.com

Large-scale interactive object segmentation with human annotators

R Benenson, S Popov, V Ferrari - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com

Manually annotating object segmentation masks is very time consuming. Interactive object
segmentation methods offer a more efficient alternative where a human annotator and a …

Cited by 288 - Related articles - All 9 versions