A review of the optimal design of neural networks based on FPGA
C Wang, Z Luo - Applied Sciences, 2022 - mdpi.com
Deep learning based on neural networks has been widely used in image recognition,
speech recognition, natural language processing, automatic driving, and other fields and …
MatRaptor: A sparse-sparse matrix multiplication accelerator based on row-wise product
Sparse-sparse matrix multiplication (SpGEMM) is a computation kernel widely used in
numerous application domains such as data analytics, graph processing, and scientific …
EIE: Efficient inference engine on compressed deep neural network
State-of-the-art deep neural networks (DNNs) have hundreds of millions of connections and
are both computationally and memory intensive, making them difficult to deploy on …
ESE: Efficient speech recognition engine with sparse LSTM on FPGA
Long Short-Term Memory (LSTM) is widely used in speech recognition. In order to achieve
higher prediction accuracy, machine learning scientists have built increasingly larger …
Efficient and effective sparse LSTM on FPGA with bank-balanced sparsity
Neural networks based on Long Short-Term Memory (LSTM) are widely deployed in latency-
sensitive language and speech applications. To speed up LSTM inference, previous …
Tensaurus: A versatile accelerator for mixed sparse-dense tensor computations
N Srivastava, H Jin, S Smith, H Rong… - … Symposium on High …, 2020 - ieeexplore.ieee.org
Tensor factorizations are powerful tools in many machine learning and data analytics
applications. Tensors are often sparse, which makes sparse tensor factorizations memory …
DeltaRNN: A power-efficient recurrent neural network accelerator
Recurrent Neural Networks (RNNs) are widely used in speech recognition and natural
language processing applications because of their capability to process temporal …
TABLA: A unified template-based framework for accelerating statistical machine learning
A growing number of commercial and enterprise systems increasingly rely on compute-
intensive Machine Learning (ML) algorithms. While the demand for these compute-intensive …
GraphLily: Accelerating graph linear algebra on HBM-equipped FPGAs
Graph processing is typically memory bound due to low compute to memory access ratio
and irregular data access pattern. The emerging high-bandwidth memory (HBM) delivers …
SparseP: Towards efficient sparse matrix vector multiplication on real processing-in-memory architectures
Several manufacturers have already started to commercialize near-bank Processing-In-
Memory (PIM) architectures, after decades of research efforts. Near-bank PIM architectures …