Recurrent models of visual attention

Authors

Volodymyr Mnih, Nicolas Heess, Alex Graves

Publication date

2014

Conference

Advances in neural information processing systems

Pages

2204-2212

Description

Applying convolutional neural networks to large images is computationally expensive because the amount of computation scales linearly with the number of image pixels. We present a novel recurrent neural network model that is capable of extracting information from an image or video by adaptively selecting a sequence of regions or locations and only processing the selected regions at high resolution. Like convolutional neural networks, the proposed model has a degree of translation invariance built-in, but the amount of computation it performs can be controlled independently of the input image size. While the model is non-differentiable, it can be trained using reinforcement learning methods to learn task-specific policies. We evaluate our model on several image classification tasks, where it significantly outperforms a convolutional neural network baseline on cluttered images, and on a dynamic visual control problem, where it learns to track a simple object without an explicit training signal for doing so.
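The claim that computation can be decoupled from input size can be illustrated with a fixed-size crop. The following is a minimal sketch, not the authors' implementation: `extract_glimpse` is a hypothetical helper, images are plain nested lists, and a single-resolution crop with zero padding is an assumed simplification. It only shows that the work done per glimpse depends on the glimpse size, not on the image size.

```python
def extract_glimpse(image, center, size):
    """Return a size x size patch of `image` centred near `center` (row, col).

    Hypothetical helper for illustration. Pixels that fall outside the
    image are filled with 0.0, so the patch shape never changes.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    top = center[0] - size // 2
    left = center[1] - size // 2
    patch = []
    for r in range(top, top + size):
        row = []
        for c in range(left, left + size):
            inside = 0 <= r < height and 0 <= c < width
            row.append(image[r][c] if inside else 0.0)
        patch.append(row)
    return patch


# Two images of very different size (assumed toy data).
small = [[float(r * 10 + c) for c in range(10)] for r in range(10)]
large = [[float(r * 100 + c) for c in range(100)] for r in range(100)]

# Each glimpse reads exactly size * size pixels, whatever the image size.
glimpse_small = extract_glimpse(small, center=(5, 5), size=4)
glimpse_large = extract_glimpse(large, center=(50, 50), size=4)
```

In a full model, a recurrent network would choose the next `center` from what it has seen so far. Because that choice is a discrete, non-differentiable step, the abstract's reinforcement learning training applies there.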

Total citations

Cited by 4701

Citations per year:
2015: 52
2016: 191
2017: 293
2018: 415
2019: 585
2020: 660
2021: 666
2022: 703
2023: 716
2024: 351
