FAIR1M: A benchmark dataset for fine-grained object recognition in high-resolution remote sensing imagery
With the rapid development of deep learning, many deep learning-based approaches have
made great achievements in object detection tasks. It is generally known that deep learning …
Artificial intelligence in the creative industries: a review
N Anantrasirichai, D Bull - Artificial intelligence review, 2022 - Springer
This paper reviews the current state of the art in artificial intelligence (AI) technologies and
applications in the context of the creative industries. A brief background of AI, and …
ConvNeXt V2: Co-designing and scaling ConvNets with masked autoencoders
Driven by improved architectures and better representation learning frameworks, the field of
visual recognition has enjoyed rapid modernization and performance boost in the early …
BiFormer: Vision transformer with bi-level routing attention
As the core building block of vision transformers, attention is a powerful tool to capture long-
range dependency. However, such power comes at a cost: it incurs a huge computation …
DiffusionDet: Diffusion model for object detection
We propose DiffusionDet, a new framework that formulates object detection as a denoising
diffusion process from noisy boxes to object boxes. During the training stage, object boxes …
Vision Mamba: Efficient visual representation learning with bidirectional state space model
Recently the state space models (SSMs) with efficient hardware-aware designs, i.e., the
Mamba deep learning model, have shown great potential for long sequence modeling …
Vision transformer adapter for dense predictions
This work investigates a simple yet powerful adapter for Vision Transformer (ViT). Unlike
recent visual transformers that introduce vision-specific inductive biases into their …
EfficientViT: Memory efficient vision transformer with cascaded group attention
Vision transformers have shown great success due to their high model capabilities.
However, their remarkable performance is accompanied by heavy computation costs, which …
Scaling up your kernels to 31x31: Revisiting large kernel design in CNNs
We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by
recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few …
DINO: DETR with improved denoising anchor boxes for end-to-end object detection
We present DINO (DETR with Improved deNoising anchOr
boxes), a state-of-the-art end-to-end object detector. DINO improves over …