Improving statistical machine translation with a multilingual paraphrase database

RM Seraj, M Siahbani, A Sarkar - Proceedings of the 2015 Conference on Empirical Methods in Natural …, 2015 - aclanthology.org
Abstract
The multilingual Paraphrase Database (PPDB) is a freely available, automatically created resource of paraphrases in multiple languages. In statistical machine translation, paraphrases can be used to provide translations for out-of-vocabulary (OOV) phrases. In this paper, we show that a graph propagation approach that uses PPDB paraphrases can improve overall translation quality. We provide an extensive comparison with previous work and show that our PPDB-based method improves the BLEU score by up to 1.79 percentage points. We show that our approach improves on the state of the art in three different settings: when faced with a limited amount of parallel training data; when there is a domain shift between training and test data; and when handling a morphologically complex source language. Our PPDB-based method also outperforms the use of distributional profiles from monolingual source data.
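
To make the idea of graph propagation over paraphrases more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the algorithm, so the update rule (a weighted average of neighbours' translation distributions), the function name `propagate`, and all phrases, weights, and probabilities below are assumptions chosen purely for illustration. Nodes are source phrases, edges carry paraphrase scores, and phrases with known translations act as fixed seeds.

```python
from collections import defaultdict


def propagate(edges, seeds, iterations=10):
    """Spread translation distributions from seed phrases to unlabeled ones.

    edges: {phrase: {neighbour_phrase: paraphrase_weight}}  (hypothetical format)
    seeds: {phrase: {translation: probability}} for phrases with known translations
    """
    labels = {node: dict(dist) for node, dist in seeds.items()}
    for _ in range(iterations):
        updated = {}
        for node, neighbours in edges.items():
            if node in seeds:
                continue  # seed phrases keep their known translations
            scores = defaultdict(float)
            total = 0.0
            for neighbour, weight in neighbours.items():
                dist = labels.get(neighbour)
                if not dist:
                    continue  # neighbour has no translations yet
                total += weight
                for translation, prob in dist.items():
                    scores[translation] += weight * prob
            if total > 0:
                # Weighted average of the neighbours' distributions
                updated[node] = {t: s / total for t, s in scores.items()}
        labels.update(updated)
    return labels


# Toy graph with invented phrases and weights (not taken from PPDB).
edges = {
    "oov_a": {"known_1": 0.8, "known_2": 0.2},
    "oov_b": {"oov_a": 1.0},  # reachable only through another OOV phrase
}
seeds = {
    "known_1": {"target_x": 1.0},
    "known_2": {"target_y": 1.0},
}
result = propagate(edges, seeds)
```

In this toy graph, `oov_a` inherits translations directly from its two paraphrases, while `oov_b` receives them one iteration later through `oov_a`, which is the reason for iterating rather than doing a single pass.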