Incorporating semantic consistency for improved semi-supervised image captioning

B Wu, Y Wo - Multimedia Tools and Applications, 2024 - Springer
The high labor cost of image captioning datasets limits the application scenarios of image
captioning methods. Therefore, the semi-supervised image captioning research that utilizes …

IFCap: Image-like retrieval and frequency-based entity filtering for zero-shot captioning

S Lee, SW Kim, T Kim, DJ Kim - arXiv preprint arXiv:2409.18046, 2024 - arxiv.org
Recent advancements in image captioning have explored text-only training methods to
overcome the limitations of paired image-text data. However, existing text-only training …

Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality

Y Oh, JW Cho, DJ Kim, IS Kweon, J Kim - arXiv preprint arXiv:2410.05210, 2024 - arxiv.org
In this paper, we propose a new method to enhance compositional understanding in
pre-trained vision and language models (VLMs) without sacrificing performance in zero-shot …

Generative adversarial network for semi-supervised image captioning

X Liang, C Li, L Tian - Computer Vision and Image Understanding, 2024 - Elsevier
Traditional supervised image captioning methods usually rely on a large number of images
and paired captions for training. However, the creation of such datasets necessitates …

Style-Enhanced Transformer for Image Captioning in Construction Scenes

K Song, L Chen, H Wang - Entropy, 2024 - mdpi.com
Image captioning is important for improving the intelligence of construction projects and
assisting managers in mastering construction site activities. However, there are few image …

Bridging Vision and Language: Advances in Image Captioning Techniques

SP Ingale, GR Bamnote - Library Progress International, 2024 - bpasjournals.com
New trends in image captioning are the central area of interest for this paper; image
captioning is an area that applies computer vision and natural language processing to …