MM-Soc: Benchmarking Multimodal Large Language Models in Social Media Platforms

Y Jin, M Choi, G Verma, J Wang, S Kumar - arXiv preprint arXiv …, 2024 - arxiv.org
Social media platforms are hubs for multimodal information exchange, encompassing text,
images, and videos, making it challenging for machines to comprehend the information or …

UV-IDM: Identity-Conditioned Latent Diffusion Model for Face UV-Texture Generation

H Li, Y Feng, S Xue, X Liu, B Zeng… - Proceedings of the …, 2024 - openaccess.thecvf.com
3D face reconstruction aims at generating high-fidelity 3D face shapes and textures
from single-view or multi-view images. However, current prevailing facial texture generation …

A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models

X Shuai, H Ding, X Ma, R Tu, YG Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org
Image editing aims to edit the given synthetic or real image to meet the specific requirements
from users. It is widely studied in recent years as a promising and challenging field of …

DiffuX2CT: Diffusion Learning to Reconstruct CT Images from Biplanar X-Rays

X Liu, Z Qiao, R Liu, H Li, J Zhang, X Zhen… - arXiv preprint arXiv …, 2024 - arxiv.org
Computed tomography (CT) is widely utilized in clinical settings because it delivers detailed
3D images of the human body. However, performing CT scans is not always feasible due to …

EditWorld: Simulating World Dynamics for Instruction-Following Image Editing

L Yang, B Zeng, J Liu, H Li, M Xu, W Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models have significantly improved the performance of image editing. Existing
methods realize various approaches to achieve high-quality image editing, including but not …

Model Inversion Attacks Through Target-Specific Conditional Diffusion Models

O Li, Y Hao, Z Wang, B Zhu, S Wang, Z Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's
training set, thereby raising privacy concerns in AI applications. Previous GAN-based MIAs …

InstructBrush: Learning Attention-based Instruction Optimization for Image Editing

R Zhao, Q Fan, F Kou, S Qin, H Gu, W Wu, P Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, instruction-based image editing methods have garnered significant attention
in image editing. However, despite encompassing a wide range of editing priors, these …

Varying Manifolds in Diffusion: From Time-varying Geometries to Visual Saliency

J Chen, M Li, Z Pan, X Gao, C Tu - arXiv preprint arXiv:2406.18588, 2024 - arxiv.org
Deep generative models learn the data distribution, which is concentrated on a
low-dimensional manifold. The geometric analysis of distribution transformation provides a better …

Stable-Hair: Real-World Hair Transfer via Diffusion Model

Y Zhang, Q Zhang, Y Song, J Liu - arXiv preprint arXiv:2407.14078, 2024 - arxiv.org
Current hair transfer methods struggle to handle diverse and intricate hairstyles, thus limiting
their applicability in real-world scenarios. In this paper, we propose a novel diffusion-based …