Deep multimodal fusion for semantic image segmentation: A survey
Recent advances in deep learning have shown excellent performance in various scene
understanding tasks. However, in some complex environments or under challenging …
Recent advances on loss functions in deep learning for computer vision
The loss function, also known as the cost function, is used for training a neural network or other
machine learning models. Over the past decade, researchers have designed many loss …
MST++: Multi-stage spectral-wise transformer for efficient spectral reconstruction
Existing leading methods for spectral reconstruction (SR) focus on designing deeper or
wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB …
Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers
Most recent semantic segmentation methods adopt a fully-convolutional network (FCN) with
an encoder-decoder architecture. The encoder progressively reduces the spatial resolution …
A survey on vision transformer
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …
A survey on visual transformer
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …
Strip pooling: Rethinking spatial pooling for scene parsing
Spatial pooling has been proven highly effective to capture long-range contextual
information for pixel-wise prediction tasks, such as scene parsing. In this paper, beyond …
Object-contextual representations for semantic segmentation
In this paper, we study the context aggregation problem in semantic segmentation.
Motivated by the fact that the label of a pixel is the category of the object that the pixel belongs to, we …
Bi-directional cross-modality feature propagation with separation-and-aggregation gate for RGB-D semantic segmentation
Depth information has proven to be a useful cue in the semantic segmentation of RGB-D
images for providing a geometric counterpart to the RGB representation. Most existing works …
Improving semantic segmentation via decoupled body and edge supervision
Existing semantic segmentation approaches either aim to improve the object's inner
consistency by modeling the global context, or refine object details along their boundaries …