A survey of techniques for optimizing deep learning on GPUs

S Mittal, S Vaishay - Journal of Systems Architecture, 2019 - Elsevier
The rise of deep learning (DL) has been fuelled by improvements in accelerators. Due to
its unique features, the GPU remains the most widely used accelerator for DL …

Edge learning: The enabling technology for distributed big data analytics in the edge

J Zhang, Z Qu, C Chen, H Wang, Y Zhan, B Ye… - ACM Computing …, 2021 - dl.acm.org
Machine Learning (ML) has demonstrated great promise in various fields, e.g., self-driving
and smart cities, which are fundamentally altering the way individuals and organizations live, work …

Rammer: Enabling holistic deep learning compiler optimizations with rTasks

L Ma, Z Xie, Z Yang, J Xue, Y Miao, W Cui… - … USENIX Symposium on …, 2020 - usenix.org
Performing Deep Neural Network (DNN) computation efficiently on hardware accelerators is
challenging. Existing DNN frameworks and compilers often treat the DNN operators in a …

Flexible high-resolution object detection on edge devices with tunable latency

S Jiang, Z Lin, Y Li, Y Shu, Y Liu - Proceedings of the 27th Annual …, 2021 - dl.acm.org
Object detection is a fundamental building block of video analytics applications. While
Neural Network (NN)-based object detection models have shown excellent accuracy on …

Optimizing CNN model inference on CPUs

Y Liu, Y Wang, R Yu, M Li, V Sharma… - 2019 USENIX Annual …, 2019 - usenix.org
The popularity of Convolutional Neural Network (CNN) models and the ubiquity of CPUs
imply that better performance of CNN model inference on CPUs can deliver significant gain …

Enabling edge-cloud video analytics for robotics applications

Y Wang, W Wang, D Liu, X Jin, J Jiang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Emerging deep learning-based video analytics tasks demand computation-intensive neural
networks and powerful computing resources on the cloud to achieve high accuracy. Due to …

DVABatch: Diversity-aware Multi-Entry Multi-Exit batching for efficient processing of DNN services on GPUs

W Cui, H Zhao, Q Chen, H Wei, Z Li, D Zeng… - 2022 USENIX Annual …, 2022 - usenix.org
DNN inferences are often batched to better utilize the hardware in existing DNN
serving systems. However, DNN serving exhibits diversity in many aspects, such as input …

AsyMo: scalable and efficient deep-learning inference on asymmetric mobile CPUs

M Wang, S Ding, T Cao, Y Liu, F Xu - Proceedings of the 27th Annual …, 2021 - dl.acm.org
On-device deep learning (DL) inference has attracted widespread interest. Mobile CPUs are the
most common hardware for on-device inference, and many inference frameworks have been …

DynaGraph: dynamic graph neural networks at scale

M Guan, AP Iyer, T Kim - Proceedings of the 5th ACM SIGMOD Joint …, 2022 - dl.acm.org
In this paper, we present DynaGraph, a system that supports dynamic Graph Neural
Networks (GNNs) efficiently. Based on the observation that existing proposals for dynamic …
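
As a rough illustration of what "dynamic GNN" means in this entry: a graph layer encodes each graph snapshot, and a recurrent unit encodes how node embeddings evolve across snapshots. The layer, sizes, and random snapshots below are hypothetical and only sketch the general pattern, not DynaGraph's design.

import torch
import torch.nn as nn

class SimpleGraphLayer(nn.Module):
    """One round of neighbor aggregation: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, adj, feats):
        return torch.relu(adj @ self.lin(feats))

num_nodes, feat_dim, hid_dim, timesteps = 16, 8, 32, 5
gnn = SimpleGraphLayer(feat_dim, hid_dim)
rnn = nn.GRU(hid_dim, hid_dim, batch_first=True)

# Random snapshots standing in for a graph whose edges/features change over time.
snapshots = []
for _ in range(timesteps):
    adj = (torch.rand(num_nodes, num_nodes) < 0.2).float()
    adj = adj / adj.sum(dim=1, keepdim=True).clamp(min=1)  # row-normalize
    feats = torch.randn(num_nodes, feat_dim)
    snapshots.append((adj, feats))

# Spatial encoding per snapshot, then temporal encoding across snapshots.
spatial = torch.stack([gnn(adj, x) for adj, x in snapshots], dim=1)  # (N, T, H)
temporal, _ = rnn(spatial)  # per-node embeddings reflecting graph evolution
print(temporal.shape)       # torch.Size([16, 5, 32])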

A survey of deep learning on CPUs: opportunities and co-optimizations

S Mittal, P Rajput, S Subramoney - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
The CPU is a powerful, pervasive, and indispensable platform for running deep learning (DL)
workloads in systems ranging from mobile to extreme-end servers. In this article, we present …