Image transformer

[HTML] Transformers in medical image analysis

K He, C Gan, Z Li, I Rekik, Z Yin, W Ji, Y Gao, Q Wang… - Intelligent …, 2023 - Elsevier

Transformers have dominated the field of natural language processing and have recently
made an impact in the area of computer vision. In the field of medical image analysis …

Cited by 231 · Related articles · All 12 versions

[HTML] sciencedirect.com

[HTML] A survey of transformers

T Lin, Y Wang, X Liu, X Qiu - AI Open, 2022 - Elsevier

Transformers have achieved great success in many artificial intelligence fields, such as
natural language processing, computer vision, and audio processing. Therefore, it is natural …

Cited by 980 · Related articles · All 4 versions

[PDF] thecvf.com

Scalable diffusion models with transformers

W Peebles, S Xie - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com

We explore a new class of diffusion models based on the transformer architecture. We train
latent diffusion models of images, replacing the commonly-used U-Net backbone with a …

Cited by 571 · Related articles · All 6 versions

[PDF] arxiv.org

RT-1: Robotics transformer for real-world control at scale

A Brohan, N Brown, J Carbajal, Y Chebotar… - arXiv preprint arXiv …, 2022 - arxiv.org

By transferring knowledge from large, diverse, task-agnostic datasets, modern machine
learning models can solve specific downstream tasks either zero-shot or with small task …

Cited by 512 · Related articles · All 3 versions

[PDF] neurips.cc

What can transformers learn in-context? A case study of simple function classes

S Garg, D Tsipras, PS Liang… - Advances in Neural …, 2022 - proceedings.neurips.cc

In-context learning is the ability of a model to condition on a prompt sequence consisting of
in-context examples (input-output pairs corresponding to some task) along with a new query …

Cited by 280 · Related articles · All 7 versions

[PDF] arxiv.org

Make-A-Scene: Scene-based text-to-image generation with human priors

O Gafni, A Polyak, O Ashual, S Sheynin… - … on Computer Vision, 2022 - Springer

Recent text-to-image generation methods provide a simple yet exciting conversion capability
between text and image domains. While these methods have incrementally improved the …

Cited by 379 · Related articles · All 4 versions

[PDF] thecvf.com

MaskGIT: Masked generative image transformer

H Chang, H Zhang, L Jiang, C Liu… - Proceedings of the …, 2022 - openaccess.thecvf.com

Generative transformers have experienced rapid popularity growth in the computer vision
community in synthesizing high-fidelity and high-resolution images. The best generative …

Cited by 353 · Related articles · All 6 versions

[PDF] thecvf.com

Efficient and explicit modelling of image hierarchies for image restoration

Y Li, Y Fan, X Xiang, D Demandolx… - Proceedings of the …, 2023 - openaccess.thecvf.com

The aim of this paper is to propose a mechanism to efficiently and explicitly model image
hierarchies in the global, regional, and local range for image restoration. To achieve that, we …

Cited by 118 · Related articles · All 6 versions

[PDF] thecvf.com

Vector quantized diffusion model for text-to-image synthesis

S Gu, D Chen, J Bao, F Wen, B Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com

We present the vector quantized diffusion (VQ-Diffusion) model for text-to-image generation.
This method is based on a vector quantized variational autoencoder (VQ-VAE) whose latent …

Cited by 605 · Related articles · All 10 versions

[PDF] acm.org

Palette: Image-to-image diffusion models

C Saharia, W Chan, H Chang, C Lee, J Ho… - ACM SIGGRAPH 2022 …, 2022 - dl.acm.org

This paper develops a unified framework for image-to-image translation based on
conditional diffusion models and evaluates this framework on four challenging image-to …

Cited by 1007 · Related articles · All 10 versions