Sparse GPU kernels for deep learning
Scientific workloads have traditionally exploited high levels of sparsity to accelerate
computation and reduce memory requirements. While deep neural networks can be made …
Sparse ReRAM engine: Joint exploration of activation and weight sparsity in compressed neural networks
Exploiting model sparsity to reduce ineffectual computation is a commonly used approach to
achieve energy efficiency for DNN inference accelerators. However, due to the tightly …
Efficient sparse-Winograd convolutional neural networks
Convolutional Neural Networks (CNNs) are computationally intensive, which limits their
application on mobile devices. Their energy is dominated by the number of multiplies …
High performance CNN accelerators based on hardware and algorithm co-optimization
Convolutional neural networks (CNNs) have been widely used in image classification and
recognition due to their effectiveness; however, CNNs use a large volume of weight data that …
SpWA: An efficient sparse Winograd convolutional neural networks accelerator on FPGAs
FPGAs have been efficient accelerators for CNN inference due to their high performance,
flexibility, and energy-efficiency. To improve the performance of CNNs on FPGAs, fast …
Making convolutions resilient via algorithm-based error detection techniques
Convolutional Neural Networks (CNNs) are being increasingly used in safety-critical and
high-performance computing systems. As such systems require high levels of resilience to …
Optimizing Selective Protection for CNN Resilience.
As CNNs are being extensively employed in high performance and safety-critical
applications that demand high reliability, it is important to ensure that they are resilient to …
WinoNN: Optimizing FPGA-based convolutional neural network accelerators using sparse Winograd algorithm
In recent years, a variety of accelerators on FPGAs have been proposed to speed up the
convolutional neural network (CNN) in many domain-specific application fields. Besides …
AUTO-PRUNE: Automated DNN pruning and mapping for ReRAM-based accelerator
Emergent ReRAM-based accelerators support in-memory computation to accelerate deep
neural network (DNN) inference. Weight matrix pruning of DNNs is a widely used technique …
Searching for fast model families on datacenter accelerators
Neural Architecture Search (NAS), together with model scaling, has shown
remarkable progress in designing high accuracy and fast convolutional architecture families …