FPGA-based accelerators of deep learning networks for learning and classification: A review

A Shawahna, SM Sait, A El-Maleh - IEEE Access, 2018 - ieeexplore.ieee.org
Due to recent advances in digital technologies, and availability of credible data, an area of
artificial intelligence, deep learning, has emerged and has demonstrated its ability and …

Scalable deep learning on distributed infrastructures: Challenges, techniques, and tools

R Mayer, HA Jacobsen - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
Deep Learning (DL) has had an immense success in the recent past, leading to state-of-the-art
results in various domains, such as image recognition and natural language processing …

Neurosurgeon: Collaborative intelligence between the cloud and mobile edge

Y Kang, J Hauswald, C Gao, A Rovinski… - ACM SIGARCH …, 2017 - dl.acm.org
The computation for today's intelligent personal assistants such as Apple Siri, Google Now,
and Microsoft Cortana, is performed in the cloud. This cloud-only approach requires …

ISAAC: A convolutional neural network accelerator with in-situ analog arithmetic in crossbars

A Shafiee, A Nag, N Muralimanohar… - ACM SIGARCH …, 2016 - dl.acm.org
A number of recent efforts have attempted to design accelerators for popular machine
learning algorithms, such as those involving convolutional and deep neural networks (CNNs …

The architectural implications of autonomous driving: Constraints and acceleration

SC Lin, Y Zhang, CH Hsu, M Skach… - Proceedings of the …, 2018 - dl.acm.org
Autonomous driving systems have attracted a significant amount of interest recently, and
many industry leaders, such as Google, Uber, Tesla, and Mobileye, have invested a large …

Benchmarking TPU, GPU, and CPU platforms for deep learning

YE Wang, GY Wei, D Brooks - arXiv preprint arXiv:1907.10701, 2019 - arxiv.org
Training deep learning models is compute-intensive and there is an industry-wide trend
towards hardware specialization to improve performance. To systematically benchmark …

A cloud-scale acceleration architecture

AM Caulfield, ES Chung, A Putnam… - 2016 49th Annual …, 2016 - ieeexplore.ieee.org
Hyperscale datacenter providers have struggled to balance the growing need for
specialized hardware (efficiency) with the economic benefits of homogeneity …

Serving heterogeneous machine learning models on Multi-GPU servers with Spatio-Temporal sharing

S Choi, S Lee, Y Kim, J Park, Y Kwon… - 2022 USENIX Annual …, 2022 - usenix.org
As machine learning (ML) techniques are applied to a widening range of applications, high
throughput ML inference serving has become critical for online services. Such ML inference …

Fused-layer CNN accelerators

M Alwani, H Chen, M Ferdman… - 2016 49th Annual IEEE …, 2016 - ieeexplore.ieee.org
Deep convolutional neural networks (CNNs) are rapidly becoming the dominant approach to
computer vision and a major component of many other pervasive machine learning tasks …

From high-level deep neural models to FPGAs

H Sharma, J Park, D Mahajan, E Amaro… - 2016 49th Annual …, 2016 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) are compute-intensive learning models with growing
applicability in a wide range of domains. FPGAs are an attractive choice for DNNs since they …