Authors
Jorge Sánchez, Florent Perronnin, Thomas Mensink, Jakob Verbeek
Publication date
2013/12/1
Journal
International Journal of Computer Vision (IJCV)
Volume
105
Issue
3
Pages
222-245
Description
A standard approach to describe an image for classification and retrieval purposes is to extract a set of local patch descriptors, encode them into a high-dimensional vector and pool them into an image-level signature. The most common patch encoding strategy consists of quantizing the local descriptors into a finite set of prototypical elements. This leads to the popular bag-of-visual-words representation. In this work, we propose to use the Fisher Kernel framework as an alternative patch encoding strategy: we describe patches by their deviation from a “universal” generative Gaussian mixture model. This representation, which we call the Fisher vector, has many advantages: it is efficient to compute, it leads to excellent results even with efficient linear classifiers, and it can be compressed with a minimal loss of accuracy using product quantization. We report experimental results on five standard datasets …
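The encoding step described above can be sketched as follows. This is a minimal illustration, assuming the commonly used diagonal-covariance Gaussian mixture formulation in which the image signature stacks normalized gradients with respect to the component means and standard deviations. The function name `fisher_vector`, its argument layout, and the toy values are hypothetical and not taken from the paper; GMM training, any further normalization, and product quantization are left out.

```python
import math


def fisher_vector(patches, weights, means, sigmas):
    """Encode local descriptors as deviations from a diagonal GMM.

    patches: list of T descriptors, each a list of D floats.
    weights, means, sigmas: parameters of a K-component diagonal GMM
    (assumed to be trained beforehand; hypothetical argument layout).
    Returns a list of length 2 * K * D.
    """
    T, K, D = len(patches), len(weights), len(means[0])
    g_mu = [[0.0] * D for _ in range(K)]
    g_sigma = [[0.0] * D for _ in range(K)]

    for x in patches:
        # Log of w_k * N(x | mu_k, diag(sigma_k^2)) for each component.
        log_p = []
        for k in range(K):
            ll = math.log(weights[k])
            for d in range(D):
                z = (x[d] - means[k][d]) / sigmas[k][d]
                ll -= 0.5 * z * z + math.log(sigmas[k][d])
                ll -= 0.5 * math.log(2 * math.pi)
            log_p.append(ll)

        # Soft assignment (posterior) of the patch to each component,
        # computed with the max-subtraction trick for numerical stability.
        m = max(log_p)
        unnorm = [math.exp(lp - m) for lp in log_p]
        total = sum(unnorm)

        # Accumulate first- and second-order deviations, pooled over patches.
        for k in range(K):
            gamma = unnorm[k] / total
            for d in range(D):
                z = (x[d] - means[k][d]) / sigmas[k][d]
                g_mu[k][d] += gamma * z
                g_sigma[k][d] += gamma * (z * z - 1.0)

    # Scale by 1 / (T * sqrt(w_k)) and 1 / (T * sqrt(2 * w_k)) respectively.
    fv = []
    for k in range(K):
        fv += [g / (T * math.sqrt(weights[k])) for g in g_mu[k]]
    for k in range(K):
        fv += [g / (T * math.sqrt(2 * weights[k])) for g in g_sigma[k]]
    return fv


# Toy usage: three 2-D "patches" and a 2-component GMM (made-up values).
patches = [[0.1, -0.2], [1.9, 2.1], [0.0, 0.3]]
fv = fisher_vector(
    patches,
    weights=[0.5, 0.5],
    means=[[0.0, 0.0], [2.0, 2.0]],
    sigmas=[[1.0, 1.0], [1.0, 1.0]],
)
print(len(fv))  # 8, i.e. 2 * K * D
```

The resulting vector has a fixed length of 2KD regardless of how many patches the image contains, which is what lets it be fed directly to a linear classifier.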
Total citations
[Bar chart: citations per year, 2013–2024]
Scholar articles
J Sánchez, F Perronnin, T Mensink, J Verbeek - International journal of computer vision, 2013