Exploring the major trends and emerging themes of artificial intelligence in the scientific leading journals amidst the COVID-19 era
Artificial intelligence (AI) has recently become the focus of academia and practitioners,
reflecting the substantial evolution of scientific production in this area, particularly during the …
Enable deep learning on mobile devices: Methods, systems, and applications
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial
intelligence (AI), including computer vision, natural language processing, and speech …
Adaptive focus for efficient video recognition
In this paper, we explore the spatial redundancy in video recognition with the aim to improve
the computational efficiency. It is observed that the most informative region in each frame of …
DS-Net++: Dynamic weight slicing for efficient inference in CNNs and vision transformers
Dynamic networks have shown their promising capability in reducing theoretical
computation complexity by adapting their architectures to the input during inference …
STCrowd: A multimodal dataset for pedestrian perception in crowded scenes
Accurately detecting and tracking pedestrians in 3D space is challenging due to large
variations in rotations, poses and scales. The situation becomes even worse for dense …
ARM: Any-time super-resolution method
This paper proposes an Any-time super-Resolution Method (ARM) to tackle the over-
parameterized single image super-resolution (SISR) models. Our ARM is motivated by three …
Shuffle-invariant network for action recognition in videos
The local key features in video are important for improving the accuracy of human action
recognition. However, most end-to-end methods focus on global feature learning from …
Content-aware rectified activation for zero-shot fine-grained image retrieval
Fine-grained image retrieval mainly focuses on learning salient features from the seen
subcategories as discriminative embedding while neglecting the problems behind zero-shot …
A large-scale study of spatiotemporal representation learning with a new benchmark on action recognition
The goal of building a benchmark (suite of datasets) is to provide a unified protocol for fair
evaluation and thus facilitate the evolution of a specific area. Nonetheless, we point out that …
OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition
Due to the resource-intensive nature of training vision-language models on expansive video
data, a majority of studies have centered on adapting pre-trained image-language models to …