DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection

Y Ge, J Xu, BN Zhao, N Joshi, L Itti, V Vineet - arXiv preprint arXiv …, 2022 - arxiv.org
We propose a new paradigm to automatically generate training data with accurate labels at
scale using the text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The …

Interpretability for reliable, efficient, and self-cognitive DNNs: From theories to applications

X Kang, J Guo, B Song, B Cai, H Sun, Z Zhang - Neurocomputing, 2023 - Elsevier
In recent years, remarkable achievements have been made in artificial intelligence tasks
and applications based on deep neural networks (DNNs), especially in the fields of vision …

Neural-Sim: Learning to generate training data with NeRF

Y Ge, H Behl, J Xu, S Gunasekar, N Joshi… - … on Computer Vision, 2022 - Springer
Training computer vision models usually requires collecting and labeling vast amounts of
imagery under a diverse set of scene configurations and properties. This process is …

Contributions of shape, texture, and color in visual recognition

Y Ge, Y Xiao, Z Xu, X Wang, L Itti - European Conference on Computer …, 2022 - Springer
We investigate the contributions of three important features of the human visual system
(HVS)—shape, texture, and color—to object classification. We build a humanoid vision …

Learning degradation-invariant representation for robust real-world person re-identification

Y Huang, X Fu, L Li, ZJ Zha - International Journal of Computer Vision, 2022 - Springer
Person re-identification (Re-ID) in real-world scenarios suffers from various degradations,
e.g., low resolution, weak lighting, and bad weather. These degradations hinder identity …

Beyond generation: Harnessing text to image models for object detection and segmentation

Y Ge, J Xu, BN Zhao, N Joshi, L Itti, V Vineet - arXiv preprint arXiv …, 2023 - arxiv.org
We propose a new paradigm to automatically generate training data with accurate labels at
scale using the text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The …

Compositional Zero-Shot Artistic Font Synthesis.

X Li, L Wu, C Wang, L Meng, X Meng - IJCAI, 2023 - ijcai.org
Recently, many researchers have made remarkable achievements in the field of artistic font
synthesis, with impressive glyph style and effect style in the results. However, due to less …

Enhance Image-to-Image Generation with LLaVA Prompt and Negative Prompt

Z Ding, P Li, Q Yang, S Li - arXiv preprint arXiv:2406.01956, 2024 - arxiv.org
This paper presents a novel approach to enhance image-to-image generation by leveraging
the multimodal capabilities of the Large Language and Vision Assistant (LLaVA). We …

Improving disentangled representation learning for gait recognition using group supervision

L Yao, W Kusakunniran, P Zhang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
For decades, gait has been gathering extensive interest due to the advantage that it can be
measured from a distance without physical contact. However, for image/video-based gait …

Juxtaform: interactive visual summarization for exploratory shape design

K Pandey, F Chevalier, K Singh - ACM Transactions on Graphics (TOG), 2023 - dl.acm.org
We present juxtaform, a novel approach to the interactive summarization of large shape
collections for conceptual shape design. We conduct a formative study to ascertain design …