Multimodal image synthesis and editing: A survey and taxonomy

F Zhan, Y Yu, R Wu, J Zhang, S Lu, L Liu… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
As information exists in various modalities in the real world, effective interaction and fusion
among multimodal information play a key role in the creation and perception of multimodal …

A review on generative adversarial networks: Algorithms, theory, and applications

J Gui, Z Sun, Y Wen, D Tao, J Ye - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Generative adversarial networks (GANs) have recently become a hot research topic;
however, they have been studied since 2014, and a large number of algorithms have been …

EGSDE: Unpaired image-to-image translation via energy-guided stochastic differential equations

M Zhao, F Bao, C Li, J Zhu - Advances in Neural …, 2022 - proceedings.neurips.cc
Score-based diffusion models (SBDMs) have achieved the SOTA FID results in unpaired
image-to-image translation (I2I). However, we notice that existing methods totally ignore the …

Ego-Exo4D: Understanding skilled human activity from first- and third-person perspectives

K Grauman, A Westbury, L Torresani… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present Ego-Exo4D, a diverse large-scale multimodal multiview video dataset
and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric …

Image-to-image translation: Methods and applications

Y Pang, J Lin, T Qin, Z Chen - IEEE Transactions on Multimedia, 2021 - ieeexplore.ieee.org
Image-to-image translation (I2I) aims to transfer images from a source domain to a target
domain while preserving the content representations. I2I has drawn increasing attention and …

Attention, please! A survey of neural attention models in deep learning

A de Santana Correia, EL Colombini - Artificial Intelligence Review, 2022 - Springer
In humans, attention is a core property of all perceptual and cognitive operations. Given our
limited ability to process competing sources, attention mechanisms select, modulate, and …

Unbalanced feature transport for exemplar-based image translation

F Zhan, Y Yu, K Cui, G Zhang, S Lu… - Proceedings of the …, 2021 - openaccess.thecvf.com
Despite the great success of GANs in image translation with different conditioned inputs
such as semantic segmentation and edge map, generating high-fidelity images with …

Generative Adversarial Networks in the built environment: A comprehensive review of the application of GANs across data types and scales

AN Wu, R Stouffs, F Biljecki - Building and Environment, 2022 - Elsevier
Generative Adversarial Networks (GANs) are a type of deep neural network that
have achieved many state-of-the-art results for generative tasks. GANs can be useful in the …

Transformer-based attention networks for continuous pixel-wise prediction

G Yang, H Tang, M Ding, N Sebe… - Proceedings of the …, 2021 - openaccess.thecvf.com
While convolutional neural networks have shown a tremendous impact on various computer
vision tasks, they generally demonstrate limitations in explicitly modeling long-range …

LANet: Local attention embedding to improve the semantic segmentation of remote sensing images

L Ding, H Tang, L Bruzzone - IEEE Transactions on Geoscience …, 2020 - ieeexplore.ieee.org
The trade-off between feature representation power and spatial localization accuracy is
crucial for the dense classification/semantic segmentation of remote sensing images (RSIs) …