Emory University at TREC LiveQA 2016: Combining Crowdsourcing and Learning-To-Rank Approaches for Real-Time Complex Question Answering

D Savenkov, E Agichtein - TREC, 2016 - denxx.github.io
Abstract
This paper describes the two QA systems we developed to participate in the TREC LiveQA 2016 shared task. The first run represents an improvement of our fully automatic real-time QA system from LiveQA 2015, Emory-QA. The second run, Emory-CRQA, which stands for Crowd-powered Real-time Question Answering, incorporates human feedback, in real time, to improve answer candidate generation and ranking. The base Emory-QA system uses the title and the body of a question to query Yahoo! Answers, Answers.com, WikiHow, and general web search, and retrieves a set of candidate answers along with their topics and contexts. This information is used to represent each candidate by a set of features, rank them with a trained LambdaMART model, and return the top-ranked candidates as an answer to the question. The second run, Emory-CRQA, integrates a crowdsourcing module, which provides the system with additional answer candidates and quality ratings, obtained in near real time (under one minute) from a crowd of workers. When Emory-CRQA receives a question, it is forwarded to the crowd, who can start working on the answer in parallel with the automatic pipeline. When the automatic pipeline is done generating and ranking candidates, a subset of them is immediately sent to the same workers who have been working on answering the question. Workers then rate the quality of all human- or system-generated candidate answers. The resulting ratings, as well as the original system scores, are used as features for the final re-ranking module, which returns the highest-scoring answer. The official run results of the task indicate promising improvements for both runs compared to the best performing system from LiveQA 2015. Additionally, they demonstrate the effectiveness of the introduced crowdsourcing module, which allowed us to achieve an improvement of ∼20% in average answer score over the fully automatic Emory-QA system.
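
The two-stage design in the abstract (LambdaMART ranking of feature-represented candidates, followed by re-ranking with crowd ratings) can be illustrated with a minimal sketch. The paper does not specify an implementation, so the use of LightGBM's LGBMRanker as the LambdaMART model, the feature values, and the linear score combination below are assumptions for illustration only, not the authors' actual system.

# Sketch of an Emory-CRQA-style two-stage ranking pipeline (assumptions noted above).
import numpy as np
import lightgbm as lgb

# Stage 1: train a LambdaMART ranker on per-candidate feature vectors.
# X: one feature row per answer candidate (e.g., retrieval score, text similarity);
# y: graded relevance labels; groups: candidates per question, so the model
# learns within-question ordering. All values here are toy data.
X = np.random.rand(12, 5)              # 12 candidates, 5 features
y = np.random.randint(0, 4, size=12)   # graded relevance labels 0-3
groups = [4, 4, 4]                     # 3 questions, 4 candidates each

ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=50, min_child_samples=1)
ranker.fit(X, y, group=groups)

# Stage 2: for a new question, score its candidates automatically, then
# re-rank by combining the system score with crowd quality ratings.
candidates = np.random.rand(4, 5)           # feature vectors for 4 candidates
system_scores = ranker.predict(candidates)  # LambdaMART scores
crowd_ratings = np.array([2.5, 4.0, 1.0, 3.5])  # hypothetical worker ratings

# A simple linear combination stands in for the paper's final re-ranking module,
# which uses both signals as features.
alpha = 0.5
final_scores = alpha * system_scores + (1 - alpha) * crowd_ratings
best = int(np.argmax(final_scores))
print(f"Returning candidate {best} with score {final_scores[best]:.3f}")

In the actual system the crowd ratings arrive in near real time while the automatic pipeline runs, so the combination step only happens once both signal sources are available for the candidate subset shown to workers.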