Backbones-review: Feature extraction networks for deep learning and deep reinforcement learning approaches

O Elharrouss, Y Akbari, N Almaadeed… - arXiv preprint arXiv …, 2022 - arxiv.org
To understand the real world using various types of data, Artificial Intelligence (AI) is the
most used technique nowadays. While finding the pattern within the analyzed data …

Planning-oriented autonomous driving

Y Hu, J Yang, L Chen, K Li, C Sima… - Proceedings of the …, 2023 - openaccess.thecvf.com
Modern autonomous driving system is characterized as modular tasks in sequential order,
i.e., perception, prediction, and planning. In order to perform a wide diversity of tasks and …

Grid-centric traffic scenario perception for autonomous driving: A comprehensive review

Y Shi, K Jiang, J Li, J Wen, Z Qian, M Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Grid-centric perception is a crucial field for mobile robot perception and navigation.
Nonetheless, grid-centric perception is less prevalent than object-centric perception for …

Tracking anything with decoupled video segmentation

HK Cheng, SW Oh, B Price… - Proceedings of the …, 2023 - openaccess.thecvf.com
Training data for video segmentation are expensive to annotate. This impedes extensions of
end-to-end algorithms to new video segmentation tasks, especially in large-vocabulary …

KITTI-360: A novel dataset and benchmarks for urban scene understanding in 2D and 3D

Y Liao, J Xie, A Geiger - IEEE Transactions on Pattern Analysis …, 2022 - ieeexplore.ieee.org
For the last few decades, several major subfields of artificial intelligence including computer
vision, graphics, and robotics have progressed largely independently from each other …

MOSE: A new dataset for video object segmentation in complex scenes

H Ding, C Liu, S He, X Jiang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Video object segmentation (VOS) aims at segmenting a particular object throughout the
entire video clip sequence. The state-of-the-art VOS methods have achieved excellent …

A generalist framework for panoptic segmentation of images and videos

T Chen, L Li, S Saxena, G Hinton… - Proceedings of the …, 2023 - openaccess.thecvf.com
Panoptic segmentation assigns semantic and instance ID labels to every pixel of an image.
As permutations of instance IDs are also valid solutions, the task requires learning of high …

SAM 2: Segment anything in images and videos

N Ravi, V Gabeur, YT Hu, R Hu, C Ryali, T Ma… - arXiv preprint arXiv …, 2024 - arxiv.org
We present Segment Anything Model 2 (SAM 2), a foundation model towards solving
promptable visual segmentation in images and videos. We build a data engine, which …

ST-P3: End-to-end vision-based autonomous driving via spatial-temporal feature learning

S Hu, L Chen, P Wu, H Li, J Yan, D Tao - European Conference on …, 2022 - Springer
Many existing autonomous driving paradigms involve a multi-stage discrete pipeline of
tasks. To better predict the control signals and enhance user safety, an end-to-end approach …

A simple single-scale vision transformer for object localization and instance segmentation

W Chen, X Du, F Yang, L Beyer, X Zhai, TY Lin… - arXiv preprint arXiv …, 2021 - arxiv.org
This work presents a simple vision transformer design as a strong baseline for object
localization and instance segmentation tasks. Transformers recently demonstrate …