Sparse GPU kernels for deep learning

T Gale, M Zaharia, C Young… - … Conference for High …, 2020 - ieeexplore.ieee.org
Scientific workloads have traditionally exploited high levels of sparsity to accelerate
computation and reduce memory requirements. While deep neural networks can be made …
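
This line of work revolves around sparse matrix-dense matrix products (SpMM) over pruned weight matrices. As a point of reference only (the paper contributes hand-tuned GPU kernels, not Python), here is a minimal CSR-based SpMM sketch; the function name, shapes, and test harness are illustrative and not taken from the paper's code:

    import numpy as np

    def csr_spmm(indptr, indices, values, dense):
        """Reference CSR sparse-times-dense matmul (illustrative baseline, not a GPU kernel)."""
        n_rows = len(indptr) - 1
        out = np.zeros((n_rows, dense.shape[1]), dtype=dense.dtype)
        for row in range(n_rows):
            for j in range(indptr[row], indptr[row + 1]):
                out[row] += values[j] * dense[indices[j]]
        return out

    # Tiny check against a dense matmul.
    rng = np.random.default_rng(0)
    A = rng.random((4, 6)) * (rng.random((4, 6)) > 0.7)   # roughly 70% zeros
    B = rng.random((6, 3))
    indptr = np.zeros(5, dtype=int)
    indices, values = [], []
    for r in range(4):
        nz = np.nonzero(A[r])[0]
        indptr[r + 1] = indptr[r] + len(nz)
        indices.extend(nz); values.extend(A[r, nz])
    assert np.allclose(csr_spmm(indptr, np.array(indices), np.array(values), B), A @ B)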

Sparse ReRAM engine: Joint exploration of activation and weight sparsity in compressed neural networks

TH Yang, HY Cheng, CL Yang, IC Tseng… - Proceedings of the 46th …, 2019 - dl.acm.org
Exploiting model sparsity to reduce ineffectual computation is a commonly used approach to
achieve energy efficiency for DNN inference accelerators. However, due to the tightly …
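
"Ineffectual computation" here means multiply-accumulates in which the weight, the activation, or both are zero. The paper explores this in a ReRAM crossbar design; the software analogue below is included only to make the joint use of weight and activation sparsity concrete, and does not reflect the accelerator's dataflow:

    import numpy as np

    def sparse_dot(weights, activations):
        """Dot product visiting only positions where BOTH operands are nonzero."""
        w_nz = set(np.flatnonzero(weights))
        a_nz = set(np.flatnonzero(activations))
        effectual = w_nz & a_nz                     # indices with nonzero products
        return sum(weights[i] * activations[i] for i in effectual), len(effectual)

    w = np.array([0.0, 0.5, 0.0, -1.2, 0.0, 0.3])
    a = np.array([1.0, 0.0, 2.0, 4.0, 0.0, 0.0])
    result, macs = sparse_dot(w, a)
    print(result, macs)   # -4.8, using 1 effectual MAC instead of 6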

Efficient sparse-Winograd convolutional neural networks

X Liu, J Pool, S Han, WJ Dally - arXiv preprint arXiv:1802.06367, 2018 - arxiv.org
Convolutional Neural Networks (CNNs) are computationally intensive, which limits their
application on mobile devices. Their energy is dominated by the number of multiplies …
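
Sparse-Winograd approaches of this kind move pruning (and ReLU) into the Winograd-transformed domain, since spatial-domain zeros do not survive the input and filter transforms. A minimal dense F(2x2, 3x3) sketch, shown only to locate the element-wise product where that sparsity would be exploited (the transform matrices are the standard ones, not code from the paper):

    import numpy as np

    # Winograd F(2x2, 3x3): Y = A.T @ [(G g G.T) * (B.T d B)] @ A
    B_T = np.array([[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]], dtype=float)
    G   = np.array([[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]], dtype=float)
    A_T = np.array([[1, 1, 1, 0], [0, 1, -1, -1]], dtype=float)

    def winograd_f2x2_3x3(tile, kernel):
        """2x2 output of a 3x3 correlation over a 4x4 input tile via Winograd."""
        U = G @ kernel @ G.T          # transformed 3x3 filter -> 4x4
        V = B_T @ tile @ B_T.T        # transformed 4x4 input tile
        return A_T @ (U * V) @ A_T.T  # element-wise product, then inverse transform

    rng = np.random.default_rng(0)
    d, g = rng.random((4, 4)), rng.random((3, 3))
    # Direct 3x3 correlation (CNN-style convolution) for the same 2x2 outputs.
    direct = np.array([[np.sum(d[i:i+3, j:j+3] * g) for j in range(2)] for i in range(2)])
    assert np.allclose(winograd_f2x2_3x3(d, g), direct)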

High performance CNN accelerators based on hardware and algorithm co-optimization

T Yuan, W Liu, J Han, F Lombardi - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Convolutional neural networks (CNNs) have been widely used in image classification and
recognition due to their effectiveness; however, CNNs use a large volume of weight data that …

SpWA: An efficient sparse Winograd convolutional neural networks accelerator on FPGAs

L Lu, Y Liang - Proceedings of the 55th Annual Design Automation …, 2018 - dl.acm.org
FPGAs have been efficient accelerators for CNN inference due to their high performance,
flexibility, and energy efficiency. To improve the performance of CNNs on FPGAs, fast …

Making convolutions resilient via algorithm-based error detection techniques

SKS Hari, MB Sullivan, T Tsai… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Convolutional Neural Networks (CNNs) are being increasingly used in safety-critical and
high-performance computing systems. As such systems require high levels of resilience to …
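
Algorithm-based error detection for convolutional layers builds on their linearity: because convolution is linear in the filters, the sum of the per-filter outputs must equal the output of the summed (checksum) filter, and a mismatch flags a fault. A minimal single-channel sketch of that check, assuming a plain "valid" correlation and hypothetical helper names rather than the paper's GPU implementation:

    import numpy as np

    def conv2d_valid(x, w):
        """Single-channel 'valid' correlation (CNN-style convolution), for illustration."""
        H, W = x.shape; kh, kw = w.shape
        out = np.empty((H - kh + 1, W - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(x[i:i+kh, j:j+kw] * w)
        return out

    def checksum_check(x, filters, outputs, tol=1e-6):
        """Convolution is linear in the filters, so sum_k conv(x, w_k) == conv(x, sum_k w_k)."""
        expected = conv2d_valid(x, np.sum(filters, axis=0))
        return np.max(np.abs(np.sum(outputs, axis=0) - expected)) < tol

    rng = np.random.default_rng(0)
    x = rng.random((8, 8))
    filters = rng.random((4, 3, 3))
    outputs = np.stack([conv2d_valid(x, w) for w in filters])
    print(checksum_check(x, filters, outputs))   # True: fault-free
    outputs[2, 1, 1] += 0.5                      # inject a single-value error
    print(checksum_check(x, filters, outputs))   # False: error detected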

Optimizing Selective Protection for CNN Resilience

A Mahmoud, SKS Hari, CW Fletcher, SV Adve, C Sakr… - ISSRE, 2021 - ma3mool.github.io
As CNNs are being extensively employed in high performance and safety-critical
applications that demand high reliability, it is important to ensure that they are resilient to …

WinoNN: Optimizing FPGA-based convolutional neural network accelerators using sparse Winograd algorithm

X Wang, C Wang, J Cao, L Gong… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
In recent years, a variety of accelerators on FPGAs have been proposed to speed up the
convolutional neural network (CNN) in many domain-specific application fields. Besides …

AUTO-PRUNE: Automated DNN pruning and mapping for ReRAM-based accelerator

S Yang, W Chen, X Zhang, S He, Y Yin… - Proceedings of the ACM …, 2021 - dl.acm.org
Emergent ReRAM-based accelerators support in-memory computation to accelerate deep
neural network (DNN) inference. Weight matrix pruning of DNNs is a widely used technique …
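
Weight matrix pruning is the knob being co-optimized with the crossbar mapping here. As a generic baseline only (AUTO-PRUNE's hardware-aware policy is not reproduced), unstructured magnitude pruning zeroes out the smallest-magnitude weights until a target sparsity is reached:

    import numpy as np

    def magnitude_prune(weights, sparsity):
        """Zero out the smallest-magnitude entries so that a `sparsity` fraction is zero.

        A generic unstructured baseline; hardware-aware schemes additionally constrain
        WHICH entries may be removed so the result maps well onto crossbar arrays.
        """
        flat = np.abs(weights).ravel()
        k = int(sparsity * flat.size)
        if k == 0:
            return weights.copy(), np.ones_like(weights, dtype=bool)
        threshold = np.partition(flat, k - 1)[k - 1]
        mask = np.abs(weights) > threshold
        return weights * mask, mask

    rng = np.random.default_rng(0)
    W = rng.standard_normal((64, 64))
    W_pruned, mask = magnitude_prune(W, 0.9)
    print(1.0 - mask.mean())   # roughly 0.9 of the weights are now zero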

Searching for fast model families on datacenter accelerators

S Li, M Tan, R Pang, A Li, L Cheng… - Proceedings of the …, 2021 - openaccess.thecvf.com
Neural Architecture Search (NAS), together with model scaling, has shown
remarkable progress in designing high accuracy and fast convolutional architecture families …