KITTI-360: A novel dataset and benchmarks for urban scene understanding in 2D and 3D

Y Liao, J Xie, A Geiger - IEEE Transactions on Pattern Analysis …, 2022 - ieeexplore.ieee.org
For the last few decades, several major subfields of artificial intelligence including computer
vision, graphics, and robotics have progressed largely independently from each other …

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

A comprehensive review of modern object segmentation approaches

Y Wang, U Ahsan, H Li, M Hagen - Foundations and Trends® …, 2022 - nowpublishers.com
Image segmentation is the task of associating pixels in an image with their respective object
class labels. It has a wide range of applications in many industries including healthcare …

Semantic flow for fast and accurate scene parsing

X Li, A You, Z Zhu, H Zhao, M Yang, K Yang… - Computer Vision–ECCV …, 2020 - Springer
In this paper, we focus on designing an effective method for fast and accurate scene parsing. A
common practice to improve the performance is to attain high resolution feature maps with …

Self-supervised learning of audio-visual objects from video

T Afouras, A Owens, JS Chung, A Zisserman - Computer Vision–ECCV …, 2020 - Springer
Our objective is to transform a video into a set of discrete audio-visual objects using self-
supervised learning. To this end, we introduce a model that uses attention to localize and …

PSANet: Point-wise spatial attention network for scene parsing

H Zhao, Y Zhang, S Liu, J Shi… - Proceedings of the …, 2018 - openaccess.thecvf.com
We notice information flow in convolutional neural networks is restricted inside local
neighborhood regions due to the physical design of convolutional filters, which limits the …

Softmax splatting for video frame interpolation

S Niklaus, F Liu - Proceedings of the IEEE/CVF conference …, 2020 - openaccess.thecvf.com
Differentiable image sampling in the form of backward warping has seen broad adoption in
tasks like depth estimation and optical flow prediction. In contrast, how to perform forward …

The ApolloScape dataset for autonomous driving

X Huang, X Cheng, Q Geng, B Cao… - Proceedings of the …, 2018 - openaccess.thecvf.com
Scene parsing aims to assign a class (semantic) label for each pixel in an image. It is a
comprehensive analysis of an image. Given the rise of autonomous driving, pixel-accurate …

Improving semantic segmentation via video propagation and label relaxation

Y Zhu, K Sapra, FA Reda, KJ Shih… - Proceedings of the …, 2019 - openaccess.thecvf.com
Semantic segmentation requires large amounts of pixel-wise annotations to learn accurate
models. In this paper, we present a video prediction-based methodology to scale up training …

Large-scale video panoptic segmentation in the wild: A benchmark

J Miao, X Wang, Y Wu, W Li, X Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com
In this paper, we present a new large-scale dataset for the video panoptic segmentation
task, which aims to assign semantic classes and track identities to all pixels in a video. As …