- Academic resource search

A survey of cross-lingual word embedding models

S Ruder, I Vulić, A Søgaard - Journal of Artificial Intelligence Research, 2019 - jair.org

Cross-lingual representations of words enable us to reason about word meaning in
multilingual contexts and are a key facilitator of cross-lingual transfer when developing …

Cited by 785 - Related articles - All 16 versions

[PDF] thecvf.com

UC2: Universal cross-lingual cross-modal vision-and-language pre-training

M Zhou, L Zhou, S Wang, Y Cheng… - Proceedings of the …, 2021 - openaccess.thecvf.com

Vision-and-language pre-training has achieved impressive success in learning multimodal
representations between vision and language. To generalize this success to non-English …

Cited by 88 - Related articles - All 9 versions

[PDF] unimib.it

SemEval-2023 task 1: Visual word sense disambiguation

A Raganato, I Calixto, A Ushio… - … 2023-Proceedings of …, 2023 - boa.unimib.it

This paper presents the Visual Word Sense Disambiguation (Visual-WSD) task. The
objective of Visual-WSD is to identify among a set of ten images the one that corresponds to …

Cited by 38 - Related articles - All 11 versions

[PDF] arxiv.org

A visual attention grounding neural model for multimodal machine translation

M Zhou, R Cheng, YJ Lee, Z Yu - arXiv preprint arXiv:1808.08266, 2018 - arxiv.org

We introduce a novel multimodal machine translation model that utilizes parallel visual and
textual information. Our model jointly optimizes the learning of a shared visual-language …

Cited by 97 - Related articles - All 8 versions

[PDF] arxiv.org

Image pivoting for learning multilingual multimodal representations

S Gella, R Sennrich, F Keller, M Lapata - arXiv preprint arXiv:1707.07601, 2017 - arxiv.org

In this paper we propose a model to learn multimodal multilingual representations for
matching images and sentences in different languages, with the aim of advancing …

Cited by 85 - Related articles - All 6 versions

[PDF] arxiv.org

Emergent translation in multi-agent communication

J Lee, K Cho, J Weston, D Kiela - arXiv preprint arXiv:1710.06922, 2017 - arxiv.org

While most machine translation systems to date are trained on large parallel corpora,
humans learn language in a different way: by being grounded in an environment and …

Cited by 72 - Related articles - All 5 versions

[PDF] arxiv.org

Good for misconceived reasons: An empirical revisiting on the need for visual context in multimodal machine translation

Z Wu, L Kong, W Bi, X Li, B Kao - arXiv preprint arXiv:2105.14462, 2021 - arxiv.org

A neural multimodal machine translation (MMT) system is one that aims to perform better
translation by extending conventional text-only translation models with multimodal …

Cited by 32 - Related articles - All 6 versions

[PDF] aaai.org

MULE: Multimodal universal language embedding

D Kim, K Saito, K Saenko, S Sclaroff… - Proceedings of the AAAI …, 2020 - ojs.aaai.org

Existing vision-language methods typically support two languages at a time at most. In this
paper, we present a modular approach which can easily be incorporated into existing vision …

Cited by 36 - Related articles - All 7 versions

[PDF] arxiv.org

Towards zero-shot cross-lingual image retrieval

P Aggarwal, A Kale - arXiv preprint arXiv:2012.05107, 2020 - arxiv.org

There has been a recent spike in interest in multi-modal Language and Vision problems. On
the language side, most of these models primarily focus on English since most multi-modal …

Cited by 22 - Related articles - All 3 versions

[PDF] arxiv.org

Multi-head attention with diversity for learning grounded multilingual multimodal representations

PY Huang, X Chang, A Hauptmann - arXiv preprint arXiv:1910.00058, 2019 - arxiv.org

With the aim of promoting and understanding the multilingual version of image search, we
leverage visual object detection and propose a model with diverse multi-head attention to …

Cited by 25 - Related articles - All 11 versions