Tiny video networks - Academic Resource Search

A review of multimodal human activity recognition with special emphasis on classification, applications, challenges and future directions

SK Yadav, K Tiwari, HM Pandey, SA Akbar - Knowledge-Based Systems, 2021 - Elsevier

Human activity recognition (HAR) is one of the most important and challenging problems in
the computer vision. It has critical application in wide variety of tasks including gaming …

Cited by 252 · Related articles · All 3 versions

A comprehensive survey on hardware-aware neural architecture search

H Benmeziane, KE Maghraoui, H Ouarnoughi… - arXiv preprint arXiv …, 2021 - arxiv.org

Neural Architecture Search (NAS) methods have been growing in popularity. These
techniques have been fundamental to automate and speed up the time consuming and error …

Cited by 104 · Related articles · All 4 versions

MoViNets: Mobile video networks for efficient video recognition

D Kondratyuk, L Yuan, Y Li, L Zhang… - Proceedings of the …, 2021 - openaccess.thecvf.com

Abstract We present Mobile Video Networks (MoViNets), a family of computation and
memory efficient video networks that can operate on streaming video for online inference …

Cited by 286 · Related articles · All 8 versions

VidTr: Video transformer without convolutions

Y Zhang, X Li, C Liu, B Shuai, Y Zhu… - Proceedings of the …, 2021 - openaccess.thecvf.com

Abstract We introduce Video Transformer (VidTr) with separable-attention for video
classification. Comparing with commonly used 3D networks, VidTr is able to aggregate …

Cited by 207 · Related articles · All 11 versions

A comprehensive study of deep video action recognition

Y Zhu, X Li, C Liu, M Zolfaghari, Y Xiong, C Wu… - arXiv preprint arXiv …, 2020 - arxiv.org

Video action recognition is one of the representative tasks for video understanding. Over the
last decade, we have witnessed great advancements in video action recognition thanks to …

Cited by 232 · Related articles · All 2 versions

Enable deep learning on mobile devices: Methods, systems, and applications

H Cai, J Lin, Y Lin, Z Liu, H Tang, H Wang… - ACM Transactions on …, 2022 - dl.acm.org

Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial
intelligence (AI), including computer vision, natural language processing, and speech …

Cited by 121 · Related articles · All 6 versions

AR-Net: Adaptive frame resolution for efficient action recognition

Y Meng, CC Lin, R Panda, P Sattigeri… - Computer Vision–ECCV …, 2020 - Springer

Action recognition is an open and challenging problem in computer vision. While current
state-of-the-art models offer excellent recognition results, their computational expense limits …

Cited by 174 · Related articles · All 8 versions

TokenLearner: What can 8 learned tokens do for images and videos?

MS Ryoo, AJ Piergiovanni, A Arnab… - arXiv preprint arXiv …, 2021 - arxiv.org

In this paper, we introduce a novel visual representation learning which relies on a handful
of adaptively learned tokens, and which is applicable to both image and video …

Cited by 127 · Related articles · All 2 versions

Can weight sharing outperform random architecture search? An investigation with TuNAS

G Bender, H Liu, B Chen, G Chu… - Proceedings of the …, 2020 - openaccess.thecvf.com

Abstract Efficient Neural Architecture Search methods based on weight sharing have shown
good promise in democratizing Neural Architecture Search for computer vision models …

Cited by 160 · Related articles · All 8 versions

FrameExit: Conditional early exiting for efficient video recognition

A Ghodrati, BE Bejnordi… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

In this paper, we propose a conditional early exiting framework for efficient video
recognition. While existing works focus on selecting a subset of salient frames to reduce the …

Cited by 90 · Related articles · All 5 versions