A survey on video diffusion models

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2023 - dl.acm.org
The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

Sora: A review on background, technology, limitations, and opportunities of large vision models

Y Liu, K Zhang, Y Li, Z Yan, C Gao, R Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Sora is a text-to-video generative AI model, released by OpenAI in February 2024. The
model is trained to generate videos of realistic or imaginative scenes from text instructions …

Patch diffusion: Faster and more data-efficient training of diffusion models

Z Wang, Y Jiang, H Zheng, P Wang… - Advances in neural …, 2024 - proceedings.neurips.cc
Diffusion models are powerful, but they require a lot of time and data to train. We propose
Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training …

FACT: Frame-action cross-attention temporal modeling for efficient action segmentation

Z Lu, E Elhamifar - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
We study supervised action segmentation whose goal is to predict framewise action labels
of a video. To capture temporal dependencies over long horizons, prior works either improve …

Progress-aware online action segmentation for egocentric procedural task videos

Y Shen, E Elhamifar - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
We address the problem of online action segmentation for egocentric procedural task
videos. While previous studies have mostly focused on offline action segmentation where …

Temporal action segmentation: An analysis of modern techniques

G Ding, F Sener, A Yao - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Temporal action segmentation (TAS) in videos aims at densely identifying video frames in
minutes-long videos with multiple action classes. As a long-range video understanding task …

Learning to schedule in diffusion probabilistic models

Y Wang, X Wang, AD Dinh, B Du, C Xu - Proceedings of the 29th ACM …, 2023 - dl.acm.org
Recently, the field of generative models has seen a significant advancement with the
introduction of Diffusion Probabilistic Models (DPMs). The Denoising Diffusion Implicit Model …

Action Detection via an Image Diffusion Process

LG Foo, T Li, H Rahmani, J Liu - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Action detection aims to localize the starting and ending points of action instances in
untrimmed videos and predict the classes of those instances. In this paper we make the …

Rethinking conditional diffusion sampling with progressive guidance

AD Dinh, D Liu, C Xu - Advances in Neural Information …, 2024 - proceedings.neurips.cc
This paper tackles two critical challenges encountered in classifier guidance for diffusion
generative models, i.e., the lack of diversity and the presence of adversarial effects. These …

Semantic2Graph: graph-based multi-modal feature fusion for action segmentation in videos

J Zhang, PH Tsai, MH Tsai - Applied Intelligence, 2024 - Springer
Video action segmentation has been widely applied in many fields. Most previous studies
employed video-based vision models for this purpose. However, they often rely on a large …