Parameter-efficient fine-tuning for large models: A comprehensive survey

Z Han, C Gao, J Liu, SQ Zhang - arXiv preprint arXiv:2403.14608, 2024 - arxiv.org
Large models represent a groundbreaking advancement in multiple application fields,
enabling remarkable achievements across various tasks. However, their unprecedented …

SSR-Encoder: Encoding selective subject representation for subject-driven generation

Y Zhang, Y Song, J Liu, R Wang, J Yu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent advancements in subject-driven image generation have led to zero-shot generation,
yet precise selection and focus on crucial subject representations remain challenging …

ZONE: Zero-shot instruction-guided local editing

S Li, B Zeng, Y Feng, S Gao, X Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent advances in vision-language models like Stable Diffusion have shown remarkable
power in creative image synthesis and editing. However, most existing text-to-image editing …

MM-Soc: Benchmarking multimodal large language models in social media platforms

Y Jin, M Choi, G Verma, J Wang, S Kumar - arXiv preprint arXiv …, 2024 - arxiv.org
Social media platforms are hubs for multimodal information exchange, encompassing text,
images, and videos, making it challenging for machines to comprehend the information or …

Tuning-free inversion-enhanced control for consistent image editing

X Duan, S Cui, G Kang, B Zhang, Z Fei, M Fan… - Proceedings of the …, 2024 - ojs.aaai.org
Consistent editing of real images is a challenging task, as it requires performing non-rigid
edits (e.g., changing postures) to the main objects in the input image without changing their …

UV-IDM: Identity-Conditioned Latent Diffusion Model for Face UV-Texture Generation

H Li, Y Feng, S Xue, X Liu, B Zeng… - Proceedings of the …, 2024 - openaccess.thecvf.com
3D face reconstruction aims at generating high-fidelity 3D face shapes and textures
from single-view or multi-view images. However, current prevailing facial texture generation …

MVD^2: Efficient Multiview 3D Reconstruction for Multiview Diffusion

XY Zheng, H Pan, YX Guo, X Tong, Y Liu - ACM SIGGRAPH 2024 …, 2024 - dl.acm.org
Multiview diffusion (MVD) has emerged as a prominent 3D generation technique, acclaimed
for its generalizability, quality, and efficiency. MVD models finetune image diffusion models …

SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer

Z Wu, C Yu, Y Jiang, C Cao, F Wang, X Bai - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advances in 2D/3D generative models enable the generation of dynamic 3D objects
from a single-view video. Existing approaches utilize score distillation sampling to form the …

SketchDream: Sketch-based text-to-3D generation and editing

FL Liu, H Fu, YK Lai, L Gao - ACM Transactions on Graphics (TOG), 2024 - dl.acm.org
Existing text-based 3D generation methods generate attractive results but lack detailed
geometry control. Sketches, known for their conciseness and expressiveness, have …

MVHuman: Tailoring 2D Diffusion with Multi-view Sampling For Realistic 3D Human Generation

S Jiang, H Luo, H Jiang, Z Wang, J Yu, L Xu - arXiv preprint arXiv …, 2023 - arxiv.org
Recent months have witnessed rapid progress in 3D generation based on diffusion models.
Most advances require fine-tuning existing 2D Stable Diffusions into multi-view settings or …