Self-supervised image-to-text and text-to-image synthesis

AS Das, S Saha - … : 28th International Conference, ICONIP 2021, Sanur …, 2021 - Springer
A comprehensive understanding of vision and language and their interrelation is crucial to
realize the underlying similarities and differences between these modalities and to learn
more generalized, meaningful representations. In recent years, most works on
Text-to-Image synthesis and Image-to-Text generation have focused on supervised generative
deep architectures, with very little interest placed on learning
the similarities between the embedding spaces across modalities. In this paper, we propose …

Self-Supervised Image-to-Text and Text-to-Image Synthesis

A Sundar Das, S Saha - arXiv e-prints, 2021 - ui.adsabs.harvard.edu