TRF-Net: A Transformer-based RGB-D fusion network for desktop object instance segmentation

H Cao, Y Zhang, D Shan, X Liu, J Zhao - Neural Computing and …, 2023 - Springer
Neural Computing and Applications, 2023, Springer
Abstract
To perform object-specific tasks on a desktop, robots need to perceive the individual objects on it. The challenge is to compute a pixel-wise mask for each object, even in the presence of occlusions and unseen objects. We take a step toward this problem by proposing a metric learning-based network called TRF-Net to perform desktop object instance segmentation. We design two ResNet-based branches to process the RGB and depth images separately. We then propose a Transformer-based fusion module called TranSE to fuse the features from both branches. This module also passes the fused features to the decoder, which helps generate fine-grained decoder features. We further propose a multi-scale feature embedding loss function, called MFE loss, that reduces the intra-class distance and increases the inter-class distance, which aids feature clustering in the embedding space. Because large-scale real-world datasets for desktop objects are lacking, TRF-Net is trained on a synthetic dataset and tested on a small-scale real-world dataset. The target objects in the testing dataset are not present in the training dataset, ensuring the novelty of the testing objects. We demonstrate that our method can produce accurate instance segmentation masks, outperforming other state-of-the-art methods on desktop object instance segmentation.
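The abstract does not give the form of the MFE loss, so the following is only a minimal sketch of the general idea it names: pulling embeddings of the same instance together (smaller intra-class distance) and pushing instance centers apart (larger inter-class distance). It uses a generic single-scale pull/push embedding loss, not the authors' multi-scale formulation. The function name `pull_push_loss` and the margins `delta_pull` and `delta_push` are hypothetical and chosen for illustration.

```python
import numpy as np

def pull_push_loss(embeddings, labels, delta_pull=0.5, delta_push=1.5):
    """Generic pull/push embedding loss (illustrative, NOT the paper's MFE loss).

    embeddings: (N, D) array of per-pixel feature vectors.
    labels:     (N,) array of instance ids, one per pixel.
    """
    ids = np.unique(labels)
    # Mean embedding (cluster center) of each instance.
    centers = np.stack([embeddings[labels == i].mean(axis=0) for i in ids])

    # Pull term: penalize pixels farther than delta_pull from their own center.
    pull = 0.0
    for k, i in enumerate(ids):
        dist = np.linalg.norm(embeddings[labels == i] - centers[k], axis=1)
        pull += np.mean(np.maximum(dist - delta_pull, 0.0) ** 2)
    pull /= len(ids)

    # Push term: penalize pairs of centers closer than 2 * delta_push.
    push = 0.0
    if len(ids) > 1:
        gaps = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
        off_diag = ~np.eye(len(ids), dtype=bool)  # ignore center-to-itself distances
        push = np.mean(np.maximum(2.0 * delta_push - gaps[off_diag], 0.0) ** 2)

    return pull + push
```

Under these assumptions, embeddings that already form tight, well-separated clusters incur zero loss, while embeddings whose instances are mixed together incur a positive loss. A multi-scale variant would presumably evaluate such a term on features at several decoder resolutions, but that detail is not stated in the abstract.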