Authors
Danny Mitry, Tunde Peto, Shabina Hayat, James E Morgan, Kay-Tee Khaw, Paul J Foster
Publication date
2013/8/21
Journal
PloS one
Volume
8
Issue
8
Pages
e71154
Publisher
Public Library of Science
Description
Aim
Crowdsourcing is the process of outsourcing numerous tasks to many untrained individuals. Our aim was to assess the performance and repeatability of crowdsourcing for the classification of retinal fundus photography.
Methods
One hundred retinal fundus photograph images with pre-determined disease criteria were selected by experts from a large cohort study. After reading brief instructions and an example classification, knowledge workers (KWs) from a crowdsourcing platform were asked to classify each image as normal or abnormal, with grades of severity. Each image was classified 20 times by different KWs. Four study designs were examined to assess the effect of varying incentive and KW experience on classification accuracy. All study designs were conducted twice to examine repeatability. Performance was assessed by comparing sensitivity, specificity and area under the receiver operating characteristic curve (AUC).
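The paper does not publish its analysis code, but the scoring described here (20 KW grades per image, compared against an expert reference using sensitivity, specificity and AUC) can be illustrated with a minimal sketch. All data below are hypothetical placeholders, and the mean-grade score and majority-vote threshold are illustrative choices, not necessarily the authors' exact aggregation rule.

```python
# Minimal sketch: aggregate 20 KW grades per image and score against expert labels.
# Hypothetical data only; not the study dataset or the authors' code.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

n_images = 100
expert_abnormal = rng.integers(0, 2, size=n_images)       # expert reference: 0 = normal, 1 = abnormal
kw_grades = rng.integers(0, 4, size=(n_images, 20))       # 20 KW grades per image on a 0-3 severity scale

# Continuous "abnormality score": mean KW grade per image.
scores = kw_grades.mean(axis=1)

# AUC for normal vs. abnormal discrimination against the expert reference.
auc = roc_auc_score(expert_abnormal, scores)

# Sensitivity/specificity at a simple majority-vote threshold:
# call an image abnormal if more than half of its KWs graded it > 0.
majority_abnormal = (kw_grades > 0).mean(axis=1) > 0.5
tp = np.sum(majority_abnormal & (expert_abnormal == 1))
tn = np.sum(~majority_abnormal & (expert_abnormal == 0))
sensitivity = tp / np.sum(expert_abnormal == 1)
specificity = tn / np.sum(expert_abnormal == 0)
print(f"AUC={auc:.3f}  sensitivity={sensitivity:.3f}  specificity={specificity:.3f}")
```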
Results
Without restriction on eligible participants, two thousand classifications of 100 images were received in under 24 hours at minimal cost. In trial 1, all study designs had an AUC (95% CI) of 0.701 (0.680–0.721) or greater for classification of normal/abnormal. In trial 1, the highest AUC (95% CI) for normal/abnormal classification was 0.757 (0.738–0.776), achieved by KWs with moderate experience. Comparable results were observed in trial 2. In trial 1, 64–86% of abnormal images were correctly classified by over half of all KWs; in trial 2, this ranged between 74–97%. Sensitivity was ≥96% for normal versus severely abnormal detections across all trials. Sensitivity for normal versus mildly abnormal …
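The abstract reports AUCs with 95% confidence intervals but does not state how the intervals were computed. One common, self-contained way to obtain such intervals is a nonparametric bootstrap over images, sketched below with hypothetical labels and scores (the function name `bootstrap_auc_ci` and all data are illustrative assumptions, not taken from the paper).

```python
# Minimal sketch: bootstrap 95% CI for the AUC. Hypothetical data only.
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=1):
    rng = np.random.default_rng(seed)
    n = len(y_true)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)          # resample images with replacement
        if len(np.unique(y_true[idx])) < 2:       # skip resamples with only one class
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    return np.percentile(aucs, [100 * alpha / 2, 100 * (1 - alpha / 2)])

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=100)             # hypothetical expert labels
y_score = y_true * 0.5 + rng.random(100) * 0.8    # hypothetical mean KW grades
lo, hi = bootstrap_auc_ci(y_true, y_score)
print(f"AUC 95% CI: ({lo:.3f}, {hi:.3f})")
```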
Total citations
[Citations-per-year histogram, 2014–2024; per-year counts not recoverable]