MM-Soc: Benchmarking Multimodal Large Language Models in Social Media Platforms
Social media platforms are hubs for multimodal information exchange, encompassing text,
images, and videos, making it challenging for machines to comprehend the information or …
UV-IDM: Identity-Conditioned Latent Diffusion Model for Face UV-Texture Generation
3D face reconstruction aims at generating high-fidelity 3D face shapes and textures
from single-view or multi-view images. However, current prevailing facial texture generation …
A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models
Image editing aims to edit the given synthetic or real image to meet the specific requirements
from users. It is widely studied in recent years as a promising and challenging field of …
DiffuX2CT: Diffusion Learning to Reconstruct CT Images from Biplanar X-Rays
X Liu, Z Qiao, R Liu, H Li, J Zhang, X Zhen… - arXiv preprint arXiv …, 2024 - arxiv.org
Computed tomography (CT) is widely utilized in clinical settings because it delivers detailed
3D images of the human body. However, performing CT scans is not always feasible due to …
EditWorld: Simulating World Dynamics for Instruction-Following Image Editing
Diffusion models have significantly improved the performance of image editing. Existing
methods realize various approaches to achieve high-quality image editing, including but not …
Model Inversion Attacks Through Target-Specific Conditional Diffusion Models
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's
training set, thereby raising privacy concerns in AI applications. Previous GAN-based MIAs …
InstructBrush: Learning Attention-based Instruction Optimization for Image Editing
R Zhao, Q Fan, F Kou, S Qin, H Gu, W Wu, P Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, instruction-based image editing methods have garnered significant attention
in image editing. However, despite encompassing a wide range of editing priors, these …
Varying Manifolds in Diffusion: From Time-varying Geometries to Visual Saliency
Deep generative models learn the data distribution, which is concentrated on a low-
dimensional manifold. The geometric analysis of distribution transformation provides a better …
Stable-Hair: Real-World Hair Transfer via Diffusion Model
Current hair transfer methods struggle to handle diverse and intricate hairstyles, thus limiting
their applicability in real-world scenarios. In this paper, we propose a novel diffusion-based …