Efficient deep learning: A survey on making deep learning models smaller, faster, and better
G Menghani - ACM Computing Surveys, 2023 - dl.acm.org
Deep learning has revolutionized the fields of computer vision, natural language
understanding, speech recognition, information retrieval, and more. However, with the …
Knowledge graphs
In this article, we provide a comprehensive introduction to knowledge graphs, which have
recently garnered significant attention from both industry and academia in scenarios that …
Photonic multiply-accumulate operations for neural networks
It has long been known that photonic communication can alleviate the data movement
bottlenecks that plague conventional microelectronic processors. More recently, there has …
Benchmarking TPU, GPU, and CPU platforms for deep learning
Training deep learning models is compute-intensive and there is an industry-wide trend
towards hardware specialization to improve performance. To systematically benchmark …
SpArch: Efficient architecture for sparse matrix multiplication
Generalized Sparse Matrix-Matrix Multiplication (SpGEMM) is a ubiquitous task in various
engineering and scientific applications. However, inner product based SpGEMM introduces …
Deep learning with limited numerical precision
Training of large-scale deep neural networks is often constrained by the available
computational resources. We study the effect of limited precision data representation and …
Hardware implementation of deep network accelerators towards healthcare and biomedical applications
The advent of dedicated Deep Learning (DL) accelerators and neuromorphic processors
has brought on new opportunities for applying both Deep and Spiking Neural Network …
GenASM: A high-performance, low-power approximate string matching acceleration framework for genome sequence analysis
Genome sequence analysis has enabled significant advancements in medical and scientific
areas such as personalized medicine, outbreak tracing, and the understanding of evolution …
HERMES-Core—A 1.59-TOPS/mm2 PCM on 14-nm CMOS In-Memory Compute Core Using 300-ps/LSB Linearized CCO-Based ADCs
R Khaddam-Aljameh, M Stanisavljevic… - IEEE Journal of Solid …, 2022 - ieeexplore.ieee.org
We present a 256 × 256 in-memory compute (IMC) core designed and fabricated in 14-nm
CMOS technology with backend-integrated multi-level phase change memory (PCM). It …
Packing sparse convolutional neural networks for efficient systolic array implementations: Column combining under joint optimization
This paper describes a novel approach of packing sparse convolutional neural networks into
a denser format for efficient implementations using systolic arrays. By combining multiple …