Exploring the major trends and emerging themes of artificial intelligence in the scientific leading journals amidst the COVID-19 era

M Soliman, T Fatnassi, I Elgammal… - Big Data and Cognitive …, 2023 - mdpi.com
Artificial intelligence (AI) has recently become the focus of academia and practitioners,
reflecting the substantial evolution of scientific production in this area, particularly during the …

Enable deep learning on mobile devices: Methods, systems, and applications

H Cai, J Lin, Y Lin, Z Liu, H Tang, H Wang… - ACM Transactions on …, 2022 - dl.acm.org
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial
intelligence (AI), including computer vision, natural language processing, and speech …

Adaptive focus for efficient video recognition

Y Wang, Z Chen, H Jiang, S Song… - Proceedings of the …, 2021 - openaccess.thecvf.com
In this paper, we explore the spatial redundancy in video recognition with the aim to improve
the computational efficiency. It is observed that the most informative region in each frame of …

DS-Net++: Dynamic weight slicing for efficient inference in CNNs and vision transformers

C Li, G Wang, B Wang, X Liang, Z Li… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Dynamic networks have shown their promising capability in reducing theoretical
computation complexity by adapting their architectures to the input during inference …

STCrowd: A multimodal dataset for pedestrian perception in crowded scenes

P Cong, X Zhu, F Qiao, Y Ren, X Peng… - Proceedings of the …, 2022 - openaccess.thecvf.com
Accurately detecting and tracking pedestrians in 3D space is challenging due to large
variations in rotations, poses and scales. The situation becomes even worse for dense …

ARM: Any-time super-resolution method

B Chen, M Lin, K Sheng, M Zhang, P Chen, K Li… - … on Computer Vision, 2022 - Springer
This paper proposes an Any-time super-Resolution Method (ARM) to tackle the
over-parameterized single image super-resolution (SISR) models. Our ARM is motivated by three …

Shuffle-invariant network for action recognition in videos

Q Shi, HB Zhang, Z Li, JX Du, Q Lei, JH Liu - ACM Transactions on …, 2022 - dl.acm.org
The local key features in video are important for improving the accuracy of human action
recognition. However, most end-to-end methods focus on global feature learning from …

Content-aware rectified activation for zero-shot fine-grained image retrieval

S Wang, J Chang, Z Wang, H Li… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Fine-grained image retrieval mainly focuses on learning salient features from the seen
subcategories as discriminative embedding while neglecting the problems behind zero-shot …

A large-scale study of spatiotemporal representation learning with a new benchmark on action recognition

A Deng, T Yang, C Chen - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
The goal of building a benchmark (suite of datasets) is to provide a unified protocol for fair
evaluation and thus facilitate the evolution of a specific area. Nonetheless, we point out that …

OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition

T Chen, H Yu, Z Yang, Z Li, W Sun… - Proceedings of the …, 2024 - openaccess.thecvf.com
Due to the resource-intensive nature of training vision-language models on expansive video
data, a majority of studies have centered on adapting pre-trained image-language models to …