Deep Vision Transformer and T5-Based for Image Captioning

KN Lam, HT Nguyen, VP Mai… - 2023 RIVF International …, 2023 - ieeexplore.ieee.org
Automatically creating description sentences for images is a task that involves aligning
image understanding with natural language processing. This paper presents a model for …