- Academic resource search

Remote sensing image scene classification: Benchmark and state of the art

G Cheng, J Han, X Lu - Proceedings of the IEEE, 2017 - ieeexplore.ieee.org

Remote sensing image scene classification plays an important role in a wide range of
applications and hence has been receiving remarkable attention. During the past years …

Cited by 2405 · Related articles · All 5 versions

Efficient structure from motion for large-scale UAV images: A review and a comparison of SfM tools

S Jiang, C Jiang, W Jiang - ISPRS Journal of Photogrammetry and Remote …, 2020 - Elsevier

Unmanned aerial vehicle (UAV) images have gained extensive attention in varying fields,
and the Structure from Motion (SfM) technique has become the gold standard for aerial …

Cited by 190 · Related articles · All 5 versions

[PDF] arxiv.org

DINOv2: Learning robust visual features without supervision

M Oquab, T Darcet, T Moutakanni, H Vo… - arXiv preprint arXiv …, 2023 - arxiv.org

The recent breakthroughs in natural language processing for model pretraining on large
quantities of data have opened the way for similar foundation models in computer vision …

Cited by 1073 · Related articles · All 11 versions

[PDF] thecvf.com

Diffusion art or digital forgery? Investigating data replication in diffusion models

G Somepalli, V Singla, M Goldblum… - Proceedings of the …, 2023 - openaccess.thecvf.com

Cutting-edge diffusion models produce images with high quality and customizability,
enabling them to be used for commercial art and graphic design purposes. But do diffusion …

Cited by 197 · Related articles · All 6 versions

[PDF] thecvf.com

Emerging properties in self-supervised vision transformers

M Caron, H Touvron, I Misra, H Jégou… - Proceedings of the …, 2021 - openaccess.thecvf.com

In this paper, we question if self-supervised learning provides new properties to Vision
Transformer (ViT) that stand out compared to convolutional networks (convnets). Beyond the …

Cited by 4557 · Related articles · All 16 versions

[PDF] neurips.cc

Battle of the backbones: A large-scale comparison of pretrained models across computer vision tasks

M Goldblum, H Souri, R Ni, M Shu… - Advances in …, 2024 - proceedings.neurips.cc

Neural network based computer vision systems are typically built on a backbone, a
pretrained or randomly initialized feature extractor. Several years ago, the default option was …

Cited by 25 · Related articles · All 5 versions

[PDF] arxiv.org

MovieNet: A holistic dataset for movie understanding

Q Huang, Y Xiong, A Rao, J Wang, D Lin - Computer Vision–ECCV 2020 …, 2020 - Springer

Recent years have seen remarkable advances in visual understanding. However, how to
understand a story-based long video with artistic styles, e.g., movie, remains challenging. In …

Cited by 221 · Related articles · All 4 versions

[PDF] arxiv.org

Vision models are more robust and fair when pretrained on uncurated images without supervision

P Goyal, Q Duval, I Seessel, M Caron, I Misra… - arXiv preprint arXiv …, 2022 - arxiv.org

Discriminative self-supervised learning allows training models on any random group of
internet images, and possibly recover salient information that helps differentiate between the …

Cited by 106 · Related articles · All 3 versions

[PDF] neurips.cc

Are labels required for improving adversarial robustness?

JB Alayrac, J Uesato, PS Huang… - Advances in …, 2019 - proceedings.neurips.cc

Recent work has uncovered the interesting (and somewhat surprising) finding that training
models to be invariant to adversarial perturbations requires substantially larger datasets …

Cited by 358 · Related articles · All 7 versions

[PDF] ieee.org

Region-based convolutional networks for accurate object detection and segmentation

R Girshick, J Donahue, T Darrell… - IEEE transactions on …, 2015 - ieeexplore.ieee.org

Object detection performance, as measured on the canonical PASCAL VOC Challenge
datasets, plateaued in the final years of the competition. The best-performing methods were …

Cited by 3246 · Related articles · All 8 versions