Imaginator: Conditional spatio-temporal GAN for video generation

Deepfakes generation and detection: State-of-the-art, open challenges, countermeasures, and way forward

M Masood, M Nawaz, KM Malik, A Javed, A Irtaza… - Applied …, 2023 - Springer

Easy access to audio-visual content on social media, combined with the availability of
modern tools such as Tensorflow or Keras, and open-source trained models, along with …

Cited by 381 Related articles All 11 versions

[PDF] arxiv.org

Deep gait recognition: A survey

A Sepas-Moghaddam, A Etemad - IEEE transactions on pattern …, 2022 - ieeexplore.ieee.org

Gait recognition is an appealing biometric modality which aims to identify individuals based
on the way they walk. Deep learning has reshaped the research landscape in this area …

Cited by 227 Related articles All 7 versions

[PDF] arxiv.org

Dynamicrafter: Animating open-domain images with video diffusion priors

J Xing, M Xia, Y Zhang, H Chen, W Yu, H Liu… - … on Computer Vision, 2025 - Springer

Animating a still image offers an engaging visual experience. Traditional image animation
techniques mainly focus on animating natural scenes with stochastic dynamics (e.g. clouds …

Cited by 134 Related articles All 2 versions

[PDF] thecvf.com

Conditional image-to-video generation with latent flow diffusion models

H Ni, C Shi, K Li, SX Huang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Conditional image-to-video (cI2V) generation aims to synthesize a new plausible video
starting from an image (e.g., a person's face) and a condition (e.g., an action class label like …

Cited by 142 Related articles All 6 versions

[PDF] arxiv.org

Lavie: High-quality video generation with cascaded latent diffusion models

Y Wang, X Chen, X Ma, S Zhou, Z Huang… - arXiv preprint arXiv …, 2023 - arxiv.org

This work aims to learn a high-quality text-to-video (T2V) generative model by leveraging a
pre-trained text-to-image (T2I) model as a basis. It is a highly desirable yet challenging task …

Cited by 194 Related articles All 3 versions

[PDF] acm.org

Motion-i2v: Consistent and controllable image-to-video generation with explicit motion modeling

X Shi, Z Huang, FY Wang, W Bian, D Li… - ACM SIGGRAPH 2024 …, 2024 - dl.acm.org

We introduce Motion-I2V, a novel framework for consistent and controllable text-guided
image-to-video generation (I2V). In contrast to previous methods that directly learn the …

Cited by 45 Related articles All 2 versions

[PDF] thecvf.com

Joint generative and contrastive learning for unsupervised person re-identification

H Chen, Y Wang, B Lagadec… - Proceedings of the …, 2021 - openaccess.thecvf.com

Recent self-supervised contrastive learning provides an effective approach for unsupervised
person re-identification (ReID) by learning invariance from different views (transformed …

Cited by 198 Related articles All 13 versions

[PDF] arxiv.org

Latent image animator: Learning to animate images via latent space navigation

Y Wang, D Yang, F Bremond, A Dantcheva - arXiv preprint arXiv …, 2022 - arxiv.org

Due to the remarkable progress of deep generative models, animating images has become
increasingly efficient, whereas associated results have become increasingly realistic …

Cited by 153 Related articles All 7 versions

[PDF] arxiv.org

The creation and detection of deepfakes: A survey

Y Mirsky, W Lee - ACM computing surveys (CSUR), 2021 - dl.acm.org

Generative deep learning algorithms have progressed to a point where it is difficult to tell the
difference between what is real and what is fake. In 2018, it was discovered how easy it is to …

Cited by 772 Related articles All 9 versions

[PDF] arxiv.org

Latte: Latent diffusion transformer for video generation

X Ma, Y Wang, G Jia, X Chen, Z Liu, YF Li… - arXiv preprint arXiv …, 2024 - arxiv.org

We propose a novel Latent Diffusion Transformer, namely Latte, for video generation. Latte
first extracts spatio-temporal tokens from input videos and then adopts a series of …

Cited by 138 Related articles All 2 versions