LIUM-CVC submissions for WMT17 multimodal translation task

VATEX: A large-scale, high-quality multilingual dataset for video-and-language research

X Wang, J Wu, J Chen, L Li… - Proceedings of the …, 2019 - openaccess.thecvf.com

We present a new large-scale multilingual video description dataset, VATEX, which contains
over 41,250 videos and 825,000 captions in both English and Chinese. Among the captions …

Cited by 598 · Related articles · All 8 versions

Findings of the second shared task on multimodal machine translation and multilingual image description

D Elliott, S Frank, L Barrault, F Bougares… - arXiv preprint arXiv …, 2017 - arxiv.org

We present the results from the second shared task on multimodal machine translation and
multilingual image description. Nine teams submitted 19 systems to two tasks. The …

Cited by 255 · Related articles · All 10 versions

A novel graph-based multi-modal fusion encoder for neural machine translation

Y Yin, F Meng, J Su, C Zhou, Z Yang, J Zhou… - arXiv preprint arXiv …, 2020 - arxiv.org

Multi-modal neural machine translation (NMT) aims to translate source sentences into a
target language paired with images. However, dominant multi-modal NMT models do not …

Cited by 157 · Related articles · All 5 versions

Trends in integration of vision and language research: A survey of tasks, datasets, and methods

A Mogadala, M Kalimuthu, D Klakow - Journal of Artificial Intelligence …, 2021 - jair.org

Interest in Artificial Intelligence (AI) and its applications has seen unprecedented
growth in the last few years. This success can be partly attributed to the advancements made …

Cited by 158 · Related articles · All 8 versions

Probing the need for visual context in multimodal machine translation

O Caglayan, P Madhyastha, L Specia… - arXiv preprint arXiv …, 2019 - arxiv.org

Current work on multimodal machine translation (MMT) has suggested that the visual
modality is either unnecessary or only marginally beneficial. We posit that this is a …

Cited by 168 · Related articles · All 10 versions

Neural machine translation with universal visual representation

Z Zhang, K Chen, R Wang, M Utiyama… - International …, 2020 - openreview.net

Though visual information has been introduced for enhancing neural machine translation
(NMT), its effectiveness strongly relies on the availability of large amounts of bilingual …

Cited by 126 · Related articles

Dynamic context-guided capsule network for multimodal machine translation

H Lin, F Meng, J Su, Y Yin, Z Yang, Y Ge… - Proceedings of the 28th …, 2020 - dl.acm.org

Multimodal machine translation (MMT), which mainly focuses on enhancing text-only
translation with visual features, has attracted considerable attention from both computer …

Cited by 85 · Related articles · All 4 versions

Neural machine translation with phrase-level universal visual representations

Q Fang, Y Feng - arXiv preprint arXiv:2203.10299, 2022 - arxiv.org

Multimodal machine translation (MMT) aims to improve neural machine translation (NMT)
with additional visual information, but most existing MMT methods require paired input of …

Cited by 39 · Related articles · All 4 versions

Distilling translations with visual awareness

J Ive, P Madhyastha, L Specia - arXiv preprint arXiv:1906.07701, 2019 - arxiv.org

Previous work on multimodal machine translation has shown that visual information is only
needed in very specific cases, for example in the presence of ambiguous words where the …

Cited by 97 · Related articles · All 5 versions

Multimodal machine translation through visuals and speech

U Sulubacak, O Caglayan, SA Grönroos, A Rouhe… - Machine …, 2020 - Springer

Multimodal machine translation involves drawing information from more than one modality,
based on the assumption that the additional modalities will contain useful alternative views …

Cited by 87 · Related articles · All 18 versions