GenTron: Diffusion Transformers for Image and Video Generation

S Chen, M Xu, J Ren, Y Cong, S He… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this study, we explore Transformer-based diffusion models for image and video
generation. Despite the dominance of Transformer architectures in various fields due to their …

Character-preserving coherent story visualization

YZ Song, Z Rui Tam, HJ Chen, HH Lu… - European Conference on …, 2020 - Springer
Story visualization aims at generating a sequence of images to narrate each sentence in a
multi-sentence story. Different from video generation that focuses on maintaining the …

Hamiltonian Operator Disentanglement of Content and Motion in Image Sequences

MA Khan, AJ Storkey - arXiv preprint arXiv:2112.01641, 2021 - academia.edu
We introduce a deep generative model for image sequences that reliably factorises the latent
space into content and motion variables. To model the diverse dynamics, we split the motion …

Hamiltonian latent operators for content and motion disentanglement in image sequences

A Khan, AJ Storkey - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We introduce HALO, a deep generative model utilising HAmiltonian Latent
Operators to reliably disentangle content and motion information in image sequences …

Hamiltonian prior to Disentangle Content and Motion in Image Sequences

A Khan, A Storkey - arXiv preprint arXiv:2112.01641, 2021 - ctrlgenworkshop.github.io
We present a deep latent variable model for high dimensional sequential data. Our model
factorises the latent space into content and motion variables. To model the diverse …

NeurInt: Learning to Interpolate through Neural ODEs

A Bose, A Das, Y Dandi, P Rai - arXiv preprint arXiv:2111.04123, 2021 - arxiv.org
A wide range of applications require learning image generation models whose latent space
effectively captures the high-level factors of variation present in the data distribution. The …

Knowledge Elicitation using Psychometric Learning

L Yin - 2023 - research.tue.nl
Citation for published version (APA): Yin, L. (2023). Knowledge …

Character-Preserving Coherent Story Visualization

HH Shuai - researchgate.net
Story visualization aims at generating a sequence of images to narrate each sentence in a
multi-sentence story. Different from video generation that focuses on maintaining the …