VATEX: A large-scale, high-quality multilingual dataset for video-and-language research

X Wang, J Wu, J Chen, L Li… - Proceedings of the …, 2019 - openaccess.thecvf.com
We present a new large-scale multilingual video description dataset, VATEX, which contains
over 41,250 videos and 825,000 captions in both English and Chinese. Among the captions …

Findings of the second shared task on multimodal machine translation and multilingual image description

D Elliott, S Frank, L Barrault, F Bougares… - arXiv preprint arXiv …, 2017 - arxiv.org
We present the results from the second shared task on multimodal machine translation and
multilingual image description. Nine teams submitted 19 systems to two tasks. The …

Probing the need for visual context in multimodal machine translation

O Caglayan, P Madhyastha, L Specia… - arXiv preprint arXiv …, 2019 - arxiv.org
Current work on multimodal machine translation (MMT) has suggested that the visual
modality is either unnecessary or only marginally beneficial. We posit that this is a …

OpenNMT: Neural machine translation toolkit

G Klein, Y Kim, Y Deng, V Nguyen, J Senellart… - arXiv preprint arXiv …, 2018 - arxiv.org
OpenNMT is an open-source toolkit for neural machine translation (NMT). The system
prioritizes efficiency, modularity, and extensibility with the goal of supporting NMT research …

Multimodal machine translation through visuals and speech

U Sulubacak, O Caglayan, SA Grönroos, A Rouhe… - Machine …, 2020 - Springer
Multimodal machine translation involves drawing information from more than one modality,
based on the assumption that the additional modalities will contain useful alternative views …

A visual attention grounding neural model for multimodal machine translation

M Zhou, R Cheng, YJ Lee, Z Yu - arXiv preprint arXiv:1808.08266, 2018 - arxiv.org
We introduce a novel multimodal machine translation model that utilizes parallel visual and
textual information. Our model jointly optimizes the learning of a shared visual-language …

Context-aware learning for neural machine translation

S Jean, K Cho - arXiv preprint arXiv:1903.04715, 2019 - arxiv.org
Interest in larger-context neural machine translation, including document-level and
multi-modal translation, has been growing. Multiple works have proposed new network …

Multimodal machine translation

O Caglayan - 2019 - theses.hal.science
Machine translation aims at automatically translating documents from one language to
another without human intervention. With the advent of deep neural networks (DNN), neural …

Ensemble sequence level training for multimodal MT: OSU-Baidu WMT18 multimodal machine translation system report

R Zheng, Y Yang, M Ma, L Huang - arXiv preprint arXiv:1808.10592, 2018 - arxiv.org
This paper describes multimodal machine translation systems developed jointly by Oregon
State University and Baidu Research for WMT 2018 Shared Task on multimodal translation …

An Implementation of a System for Video Translation Using OCR

SM Hwang, HG Yeom - Software Engineering in IoT, Big Data, Cloud and …, 2021 - Springer
As machine learning research has developed, translation and image analysis, such as
optical character recognition, have made great progress. However, video translation …