NVAE: A deep hierarchical variational autoencoder

A comprehensive survey of AI-generated content (AIGC): A history of generative AI from GAN to ChatGPT

Y Cao, S Li, Y Liu, Z Yan, Y Dai, PS Yu… - arXiv preprint arXiv …, 2023 - arxiv.org

Recently, ChatGPT, along with DALL-E-2 and Codex, has been gaining significant attention
from society. As a result, many individuals have become interested in related resources and …

Cited by 514; 2 versions

A comprehensive survey on design and application of autoencoder in deep learning

P Li, Y Pei, J Li - Applied Soft Computing, 2023 - Elsevier

Autoencoder is an unsupervised learning model, which can automatically learn data
features from a large number of samples and can act as a dimensionality reduction method …

Cited by 102; 2 versions

LION: Latent point diffusion models for 3D shape generation

A Vahdat, F Williams, Z Gojcic… - Advances in …, 2022 - proceedings.neurips.cc

Denoising diffusion models (DDMs) have shown promising results in 3D point cloud
synthesis. To advance 3D DDMs and make them useful for digital artists, we require (i) high …

Cited by 313; 5 versions

Mastering diverse domains through world models

D Hafner, J Pasukonis, J Ba, T Lillicrap - arXiv preprint arXiv:2301.04104, 2023 - arxiv.org

Developing a general algorithm that learns to solve tasks across a wide range of
applications has been a fundamental challenge in artificial intelligence. Although current …

Cited by 324; 2 versions

Rodin: A generative model for sculpting 3D digital avatars using diffusion

T Wang, B Zhang, T Zhang, S Gu… - Proceedings of the …, 2023 - openaccess.thecvf.com

This paper presents a 3D diffusion model that automatically generates 3D digital avatars
represented as neural radiance fields (NeRFs). A significant challenge for 3D diffusion is …

Cited by 189; 9 versions

NExT-GPT: Any-to-any multimodal LLM

S Wu, H Fei, L Qu, W Ji, TS Chua - arXiv preprint arXiv:2309.05519, 2023 - arxiv.org

While recently Multimodal Large Language Models (MM-LLMs) have made exciting strides,
they mostly fall prey to the limitation of only input-side multimodal understanding, without the …

Cited by 222; 4 versions

Diffusion models for adversarial purification

W Nie, B Guo, Y Huang, C Xiao, A Vahdat… - arXiv preprint arXiv …, 2022 - arxiv.org

Adversarial purification refers to a class of defense methods that remove adversarial
perturbations using a generative model. These methods do not make assumptions on the …

Cited by 315; 8 versions

Make-A-Scene: Scene-based text-to-image generation with human priors

O Gafni, A Polyak, O Ashual, S Sheynin… - … on Computer Vision, 2022 - Springer

Recent text-to-image generation methods provide a simple yet exciting conversion capability
between text and image domains. While these methods have incrementally improved the …

Cited by 386; 4 versions

PhysDiff: Physics-guided human motion diffusion model

Y Yuan, J Song, U Iqbal, A Vahdat… - Proceedings of the …, 2023 - openaccess.thecvf.com

Denoising diffusion models hold great promise for generating diverse and realistic human
motions. However, existing motion diffusion models largely disregard the laws of physics in …

Cited by 141; 5 versions

MaskGIT: Masked generative image transformer

H Chang, H Zhang, L Jiang, C Liu… - Proceedings of the …, 2022 - openaccess.thecvf.com

Generative transformers have experienced rapid popularity growth in the computer vision
community in synthesizing high-fidelity and high-resolution images. The best generative …

Cited by 361; 6 versions