Semantic video CNNs through representation warping

Y Liao, J Xie, A Geiger - IEEE Transactions on Pattern Analysis …, 2022 - ieeexplore.ieee.org

For the last few decades, several major subfields of artificial intelligence including computer
vision, graphics, and robotics have progressed largely independently from each other …

Cited by 498 | Related articles | All 13 versions

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Cited by 69 | Related articles | All 3 versions

A comprehensive review of modern object segmentation approaches

Y Wang, U Ahsan, H Li, M Hagen - Foundations and Trends® …, 2022 - nowpublishers.com

Image segmentation is the task of associating pixels in an image with their respective object
class labels. It has a wide range of applications in many industries including healthcare …

Cited by 24 | Related articles | All 5 versions

Semantic flow for fast and accurate scene parsing

X Li, A You, Z Zhu, H Zhao, M Yang, K Yang… - Computer Vision–ECCV …, 2020 - Springer

In this paper, we focus on designing effective method for fast and accurate scene parsing. A
common practice to improve the performance is to attain high resolution feature maps with …

Cited by 417 | Related articles | All 6 versions

Self-supervised learning of audio-visual objects from video

T Afouras, A Owens, JS Chung, A Zisserman - Computer Vision–ECCV …, 2020 - Springer

Our objective is to transform a video into a set of discrete audio-visual objects using
self-supervised learning. To this end, we introduce a model that uses attention to localize and …

Cited by 266 | Related articles | All 8 versions

PSANet: Point-wise spatial attention network for scene parsing

H Zhao, Y Zhang, S Liu, J Shi… - Proceedings of the …, 2018 - openaccess.thecvf.com

We notice information flow in convolutional neural networks is restricted inside local
neighborhood regions due to the physical design of convolutional filters, which limits the …

Cited by 1244 | Related articles | All 14 versions

Softmax splatting for video frame interpolation

S Niklaus, F Liu - Proceedings of the IEEE/CVF conference …, 2020 - openaccess.thecvf.com

Differentiable image sampling in the form of backward warping has seen broad adoption in
tasks like depth estimation and optical flow prediction. In contrast, how to perform forward …

Cited by 381 | Related articles | All 10 versions

The ApolloScape dataset for autonomous driving

X Huang, X Cheng, Q Geng, B Cao… - Proceedings of the …, 2018 - openaccess.thecvf.com

Scene parsing aims to assign a class (semantic) label for each pixel in an image. It is a
comprehensive analysis of an image. Given the rise of autonomous driving, pixel-accurate …

Cited by 1257 | Related articles | All 20 versions

Improving semantic segmentation via video propagation and label relaxation

Y Zhu, K Sapra, FA Reda, KJ Shih… - Proceedings of the …, 2019 - openaccess.thecvf.com

Semantic segmentation requires large amounts of pixel-wise annotations to learn accurate
models. In this paper, we present a video prediction-based methodology to scale up training …

Cited by 494 | Related articles | All 6 versions

Large-scale video panoptic segmentation in the wild: A benchmark

J Miao, X Wang, Y Wu, W Li, X Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com

In this paper, we present a new large-scale dataset for the video panoptic segmentation
task, which aims to assign semantic classes and track identities to all pixels in a video. As …

Cited by 73 | Related articles | All 4 versions