DreamLLM: Synergistic multimodal comprehension and creation

R Dong, C Han, Y Peng, Z Qi, Z Ge, J Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper presents DreamLLM, a learning framework that first achieves versatile
Multimodal Large Language Models (MLLMs) empowered with frequently overlooked …

MI-GAN: A simple baseline for image inpainting on mobile devices

A Sargsyan, S Navasardyan, X Xu… - Proceedings of the …, 2023 - openaccess.thecvf.com
In recent years, many deep learning based image inpainting methods have been developed
by the research community. Some of those methods have shown impressive image …

ShapeLLM: Universal 3D object understanding for embodied interaction

Z Qi, R Dong, S Zhang, H Geng, C Han, Z Ge… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper presents ShapeLLM, the first 3D Multimodal Large Language Model (LLM)
designed for embodied interaction, exploring a universal 3D object understanding with 3D …

Compressing image-to-image translation GANs using local density structures on their learned manifold

A Ganjdanesh, S Gao, H Alipanah… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Generative Adversarial Networks (GANs) have shown remarkable success in modeling
complex data distributions for image-to-image translation. Still, their high computational …

Label-guided auxiliary training improves 3D object detector

Y Huang, X Liu, Y Zhu, Z Xu, C Shen, Z Che… - … on Computer Vision, 2022 - Springer
Detecting 3D objects from point clouds is a practical yet challenging task that has attracted
increasing attention recently. In this paper, we propose a Label-Guided auxiliary training …

DreamBench++: A human-aligned benchmark for personalized image generation

Y Peng, Y Cui, H Tang, Z Qi, R Dong, J Bai… - arXiv preprint arXiv …, 2024 - arxiv.org
Personalized image generation holds great promise in assisting humans in everyday work
and life due to its impressive function in creatively generating personalized content …

CoroNetGAN: Controlled Pruning of GANs via Hypernetworks

A Kumar, K Anand, S Mandloi… - Proceedings of the …, 2023 - openaccess.thecvf.com
Generative Adversarial Networks (GANs) have proven to exhibit remarkable
performance and are widely used across many generative computer vision applications …

Structured Knowledge Distillation Towards Efficient and Compact Multi-View 3D Detection

L Zhang, Y Shi, HS Tai, Z Zhang, Y He, K Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Detecting 3D objects from multi-view images is a fundamental problem in 3D computer
vision. Recently, significant breakthrough has been made in multi-view 3D detection tasks …

Masked Discriminators for Content-Consistent Unpaired Image-to-Image Translation

B Stuhr, J Brauer, B Schick, J Gonzàlez - arXiv preprint arXiv:2309.13188, 2023 - arxiv.org
A common goal of unpaired image-to-image translation is to preserve content consistency
between source images and translated images while mimicking the style of the target …

Structured Knowledge Distillation Towards Efficient Multi-View 3D Object Detection

L Zhang, Y Shi, K Wang, Z Zhang, HS Tai, Y He… - BMVC, 2023 - papers.bmvc2023.org
Detecting 3D objects from multi-view images is a fundamental problem in 3D computer
vision. Recently, significant breakthrough has been made in multi-view 3D detection tasks …