Backbones-review: Feature extraction networks for deep learning and deep reinforcement learning approaches
To understand the real world using various types of data, Artificial Intelligence (AI) is the
most used technique nowadays. While finding the pattern within the analyzed data …
Planning-oriented autonomous driving
Modern autonomous driving system is characterized as modular tasks in sequential order,
i.e., perception, prediction, and planning. In order to perform a wide diversity of tasks and …
Grid-centric traffic scenario perception for autonomous driving: A comprehensive review
Grid-centric perception is a crucial field for mobile robot perception and navigation.
Nonetheless, grid-centric perception is less prevalent than object-centric perception for …
Tracking anything with decoupled video segmentation
Training data for video segmentation are expensive to annotate. This impedes extensions of
end-to-end algorithms to new video segmentation tasks, especially in large-vocabulary …
KITTI-360: A novel dataset and benchmarks for urban scene understanding in 2D and 3D
For the last few decades, several major subfields of artificial intelligence including computer
vision, graphics, and robotics have progressed largely independently from each other …
MOSE: A new dataset for video object segmentation in complex scenes
Video object segmentation (VOS) aims at segmenting a particular object throughout the
entire video clip sequence. The state-of-the-art VOS methods have achieved excellent …
A generalist framework for panoptic segmentation of images and videos
Panoptic segmentation assigns semantic and instance ID labels to every pixel of an image.
As permutations of instance IDs are also valid solutions, the task requires learning of high …
SAM 2: Segment anything in images and videos
We present Segment Anything Model 2 (SAM 2), a foundation model towards solving
promptable visual segmentation in images and videos. We build a data engine, which …
ST-P3: End-to-end vision-based autonomous driving via spatial-temporal feature learning
Many existing autonomous driving paradigms involve a multi-stage discrete pipeline of
tasks. To better predict the control signals and enhance user safety, an end-to-end approach …
A simple single-scale vision transformer for object localization and instance segmentation
This work presents a simple vision transformer design as a strong baseline for object
localization and instance segmentation tasks. Transformers recently demonstrate …