Video (language) modeling: a baseline for generative models of natural videos

MA Ranzato, A Szlam, J Bruna, M Mathieu… - arXiv preprint arXiv …, 2014 - arxiv.org
We propose a strong baseline model for unsupervised feature learning using video data. By
learning to predict missing frames or extrapolate future frames from an input video
sequence, the model discovers both spatial and temporal correlations which are useful to
represent complex deformations and motion patterns. The models we propose are largely
borrowed from the language modeling literature, and adapted to the vision domain by
quantizing the space of image patches into a large dictionary. We demonstrate the approach …
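
The patch quantization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes non-overlapping grayscale patches and nearest-centroid assignment against a dictionary that is already given; the function names `extract_patches` and `quantize`, the patch size, and the toy dictionary are all hypothetical and are not taken from the paper, whose actual procedure may differ.

```python
import numpy as np

def extract_patches(frame, patch):
    """Split a 2-D grayscale frame into non-overlapping patch x patch tiles.

    Returns an array with one flattened tile per row, in row-major tile order.
    """
    h, w = frame.shape
    h, w = h - h % patch, w - w % patch  # drop any ragged border
    tiles = frame[:h, :w].reshape(h // patch, patch, w // patch, patch)
    return tiles.transpose(0, 2, 1, 3).reshape(-1, patch * patch)

def quantize(patches, dictionary):
    """Map each patch to the index of its nearest dictionary entry.

    Nearest is measured by squared Euclidean distance (an assumption).
    """
    diffs = patches[:, None, :] - dictionary[None, :, :]
    return (diffs ** 2).sum(axis=2).argmin(axis=1)

# Toy example: a 4x4 frame and a two-entry dictionary (all-dark, all-bright).
frame = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [1, 1, 0, 0],
                  [1, 1, 0, 0]], dtype=float)
dictionary = np.array([[0, 0, 0, 0],
                       [1, 1, 1, 1]], dtype=float)
tokens = quantize(extract_patches(frame, 2), dictionary)
print(tokens.tolist())
```

Once each frame is reduced to a grid of discrete indices like this, a sequence model of the kind used for text can in principle be trained to predict the indices of missing or future frames.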

[CITATION][C] Video (language) modeling: A baseline for generative models of natural videos. arXiv 2014

M Ranzato, A Szlam, J Bruna, M Mathieu, R Collobert… - arXiv preprint arXiv:1412.6604