Authors
Noah Arthurs, Sawyer Birnbaum, Nate Gruver
Publication date
2017
Journal
Tech. Rep.
Description
The success of a YouTube channel is driven in large part by the quality of the thumbnails chosen to represent each video. In this paper, we describe a CNN architecture that learns the thumbnail qualities of successful videos and then uses them to select the best thumbnail from the frames of a video. On the classification task, the model achieved accuracy on par with a human benchmark, and the final thumbnail selector picked what we deemed “reasonable” frames about 80% of the time. We also performed an in-depth analysis of the classifier and used data augmentation to try to correct the flaws we noticed. A later model incorporated video category information in an attempt to produce more semantically fitting thumbnails. Ultimately, neither augmentation nor the additional semantic information changed frame-selection performance much relative to the earlier results, but both revealed promising qualitative structure in the selection task.
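The selection step described above (score every candidate frame, then keep the best one) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the names `select_thumbnail` and `mean_brightness` are hypothetical, and the brightness scorer is a toy stand-in for the trained CNN classifier, whose details are not given here.

```python
from typing import Callable, Sequence, TypeVar

Frame = TypeVar("Frame")


def select_thumbnail(
    frames: Sequence[Frame],
    score_frame: Callable[[Frame], float],
) -> int:
    """Return the index of the highest-scoring frame.

    `score_frame` is assumed to map a frame to a "good thumbnail" score,
    for example the positive-class probability from a classifier.
    Ties are resolved in favor of the earliest frame.
    """
    if not frames:
        raise ValueError("need at least one frame")
    scores = [score_frame(frame) for frame in frames]
    # max() returns the first index that attains the maximum score.
    return max(range(len(frames)), key=scores.__getitem__)


def mean_brightness(frame: Sequence[float]) -> float:
    """Toy scorer (an assumption for this example): average pixel value."""
    return sum(frame) / len(frame)


# Three tiny "frames", each a flat list of pixel intensities in [0, 1].
frames = [[0.1, 0.2], [0.8, 0.9], [0.4, 0.5]]
best_index = select_thumbnail(frames, mean_brightness)
```

In practice the scorer would be replaced by whatever model produces the thumbnail-quality estimate; the selection logic itself does not depend on how the score is computed.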
Total citations
2020: 2, 2021: 1, 2022: 1, 2023: 2, 2024: 1