Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities
Model merging is an efficient empowerment technique in the machine learning community
that does not require the collection of raw training data and does not require expensive …
BadMerging: Backdoor Attacks against Model Merging
Fine-tuning pre-trained models for downstream tasks has led to a proliferation of open-
sourced task-specific models. Recently, Model Merging (MM) has emerged as an effective …
Scalable Ranked Preference Optimization for Text-to-Image Generation
Direct Preference Optimization (DPO) has emerged as a powerful approach to align text-to-
image (T2I) models with human feedback. Unfortunately, successful application of DPO to …
Generate Any Scene: Evaluating and Improving Text-to-Vision Generation with Scene Graph Programming
DALL-E and Sora have gained attention by producing implausible images, such as
"astronauts riding a horse in space." Despite the proliferation of text-to-vision models that …
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models
Recent advancements in generation models have showcased remarkable capabilities in
generating fantastic content. However, most of them are trained on proprietary high-quality …
Camera Settings as Tokens: Modeling Photography on Latent Diffusion Models
IS Fang, YH Han, JC Chen - SIGGRAPH Asia 2024 Conference Papers, 2024 - dl.acm.org
Text-to-image models have revolutionized content creation, enabling users to generate
images from natural language prompts. While recent advancements in conditioning these …
DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation
Storytelling video generation (SVG) has recently emerged as a task to create long, multi-
motion, multi-scene videos that consistently represent the story described in the input text …
VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement
Recent text-to-video (T2V) diffusion models have demonstrated impressive generation
capabilities across various domains. However, these models often generate videos that …