Extending the Gold Standard for a Lexical Substitution Task: is it worth it?

L Tanguy, C Fabre, L Rivière - Proceedings of the Eleventh International Conference on Language …, 2018 - aclanthology.org
Abstract
We present a new evaluation scheme for the lexical substitution task. Following McCarthy and Navigli (2007), we conducted an annotation task for French that combines two datasets. In the first, 300 sentences, each containing one of 30 distinct target words, were submitted to annotators, who were asked to provide substitutes. The second contains the substitutes proposed by the systems that participated in the lexical substitution task on the same data. The aim is, first, to assess the systems’ capacity to provide good substitutes that the annotators did not propose and, second, to measure how a new gold standard incorporating these additional data affects the evaluation of the task. While McCarthy and Navigli (2009) conducted a similar post hoc analysis, the systems’ performance has not, to our knowledge, been re-evaluated. This experiment reveals interesting differences between the two resulting datasets and gives insight into how automatically retrieved substitutes can provide complementary data to a lexical production task, though without a major impact on the evaluation of the systems.