DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection
We propose a new paradigm to automatically generate training data with accurate labels at
scale using the text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The …
Interpretability for reliable, efficient, and self-cognitive DNNs: From theories to applications
In recent years, remarkable achievements have been made in artificial intelligence tasks
and applications based on deep neural networks (DNNs), especially in the fields of vision …
Neural-sim: Learning to generate training data with nerf
Training computer vision models usually requires collecting and labeling vast amounts of
imagery under a diverse set of scene configurations and properties. This process is …
Contributions of shape, texture, and color in visual recognition
We investigate the contributions of three important features of the human visual system
(HVS)—shape, texture, and color—to object classification. We build a humanoid vision …
Learning degradation-invariant representation for robust real-world person re-identification
Person re-identification (Re-ID) in real-world scenarios suffers from various degradations,
e.g., low resolution, weak lighting, and bad weather. These degradations hinder identity …
Beyond generation: Harnessing text to image models for object detection and segmentation
We propose a new paradigm to automatically generate training data with accurate labels at
scale using the text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The …
Compositional Zero-Shot Artistic Font Synthesis.
X Li, L Wu, C Wang, L Meng, X Meng - IJCAI, 2023 - ijcai.org
Recently, many researchers have made remarkable achievements in the field of artistic font
synthesis, with impressive glyph style and effect style in the results. However, due to less …
Enhance Image-to-Image Generation with LLaVA Prompt and Negative Prompt
Z Ding, P Li, Q Yang, S Li - arXiv preprint arXiv:2406.01956, 2024 - arxiv.org
This paper presents a novel approach to enhance image-to-image generation by leveraging
the multimodal capabilities of the Large Language and Vision Assistant (LLaVA). We …
Improving disentangled representation learning for gait recognition using group supervision
L Yao, W Kusakunniran, P Zhang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
For decades, gait has been gathering extensive interest due to the advantage that it can be
measured from a distance without physical contact. However, for image/video-based gait …
Juxtaform: interactive visual summarization for exploratory shape design
We present juxtaform, a novel approach to the interactive summarization of large shape
collections for conceptual shape design. We conduct a formative study to ascertain design …