A review on deep learning in UAV remote sensing
Abstract Deep Neural Networks (DNNs) learn representation from data with an impressive
capability, and brought important breakthroughs for processing images, time-series, natural …
A review on 2D instance segmentation based on deep neural networks
W Gu, S Bai, L Kong - Image and Vision Computing, 2022 - Elsevier
Image instance segmentation involves labeling pixels of images with classes and instances,
which is one of the pivotal technologies in many domains, such as natural scenes …
Image as a foreign language: BEiT pretraining for vision and vision-language tasks
A big convergence of language, vision, and multimodal pretraining is emerging. In this work,
we introduce a general-purpose multimodal foundation model BEiT-3, which achieves …
InternImage: Exploring large-scale vision foundation models with deformable convolutions
Compared to the great progress of large-scale vision transformers (ViTs) in recent years,
large-scale models based on convolutional neural networks (CNNs) are still in an early …
EVA: Exploring the limits of masked visual representation learning at scale
We launch EVA, a vision-centric foundation model to explore the limits of visual
representation at scale using only publicly accessible data. EVA is a vanilla ViT pre-trained …
DiffusionDet: Diffusion model for object detection
We propose DiffusionDet, a new framework that formulates object detection as a denoising
diffusion process from noisy boxes to object boxes. During the training stage, object boxes …
Vision Mamba: Efficient visual representation learning with bidirectional state space model
Recently the state space models (SSMs) with efficient hardware-aware designs, i.e., the
Mamba deep learning model, have shown great potential for long sequence modeling …
EVA-02: A visual representation for Neon Genesis
We launch EVA-02, a next-generation Transformer-based visual representation pre-trained
to reconstruct strong and robust language-aligned vision features via masked image …
Vision transformer adapter for dense predictions
This work investigates a simple yet powerful adapter for Vision Transformer (ViT). Unlike
recent visual transformers that introduce vision-specific inductive biases into their …
Universal instance perception as object discovery and retrieval
All instance perception tasks aim at finding certain objects specified by some queries such
as category names, language expressions, and target annotations, but this complete field …