Global aggregation then local distribution in fully convolutional networks

Y Zhang, D Sidibé, O Morel, F Mériaudeau - Image and Vision Computing, 2021 - Elsevier

Recent advances in deep learning have shown excellent performance in various scene
understanding tasks. However, in some complex environments or under challenging …

Cited by 205 · Related articles · All 5 versions

Recent advances on loss functions in deep learning for computer vision

Y Tian, D Su, S Lauria, X Liu - Neurocomputing, 2022 - Elsevier

The loss function, also known as cost function, is used for training a neural network or other
machine learning models. Over the past decade, researchers have designed many loss …

Cited by 104 · Related articles · All 2 versions

MST++: Multi-stage spectral-wise transformer for efficient spectral reconstruction

Y Cai, J Lin, Z Lin, H Wang, Y Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com

Existing leading methods for spectral reconstruction (SR) focus on designing deeper or
wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB …

Cited by 208 · Related articles · All 11 versions

Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers

S Zheng, J Lu, H Zhao, X Zhu, Z Luo… - Proceedings of the …, 2021 - openaccess.thecvf.com

Most recent semantic segmentation methods adopt a fully-convolutional network (FCN) with
an encoder-decoder architecture. The encoder progressively reduces the spatial resolution …

Cited by 3647 · Related articles · All 10 versions

A survey on vision transformer

K Han, Y Wang, H Chen, X Chen, J Guo… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …

Cited by 2514 · Related articles · All 7 versions

A survey on visual transformer

K Han, Y Wang, H Chen, X Chen, J Guo, Z Liu… - arXiv preprint arXiv …, 2020 - arxiv.org

Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …

Cited by 386 · Related articles · All 3 versions

Strip pooling: Rethinking spatial pooling for scene parsing

Q Hou, L Zhang, MM Cheng… - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com

Spatial pooling has been proven highly effective to capture long-range contextual
information for pixel-wise prediction tasks, such as scene parsing. In this paper, beyond …

Cited by 706 · Related articles · All 11 versions

Object-contextual representations for semantic segmentation

Y Yuan, X Chen, J Wang - Computer Vision–ECCV 2020: 16th European …, 2020 - Springer

In this paper, we study the context aggregation problem in semantic segmentation.
Motivated by that the label of a pixel is the category of the object that the pixel belongs to, we …

Cited by 1773 · Related articles · All 9 versions

Bi-directional cross-modality feature propagation with separation-and-aggregation gate for RGB-D semantic segmentation

X Chen, KY Lin, J Wang, W Wu, C Qian, H Li… - European conference on …, 2020 - Springer

Depth information has proven to be a useful cue in the semantic segmentation of RGB-D
images for providing a geometric counterpart to the RGB representation. Most existing works …

Cited by 378 · Related articles · All 6 versions

Improving semantic segmentation via decoupled body and edge supervision

X Li, X Li, L Zhang, G Cheng, J Shi, Z Lin, S Tan… - Computer Vision–ECCV …, 2020 - Springer

Existing semantic segmentation approaches either aim to improve the object's inner
consistency by modeling the global context, or refine objects detail along their boundaries …

Cited by 308 · Related articles · All 8 versions