From CNN to DNN hardware accelerators: A survey on design, exploration, simulation, and frameworks

LR Juracy, R Garibotti, FG Moraes - Foundations and Trends® …, 2023 - nowpublishers.com
Over the past decade, a massive proliferation of machine learning algorithms has emerged,
from applications for surveillance to self-driving cars. The turning point occurred with the …

A fast, accurate, and comprehensive PPA estimation of convolutional hardware accelerators

LR Juracy, A de Morais Amory… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Convolutional Neural Networks (CNN) are widely adopted for Machine Learning (ML) tasks,
such as classification and computer vision. GPUs became the reference platforms for both …

A high-level modeling framework for estimating hardware metrics of CNN accelerators

LR Juracy, MT Moreira… - … on Circuits and …, 2021 - ieeexplore.ieee.org
GPUs became the reference platform for both training and inference phases of
Convolutional Neural Networks (CNN) due to their tailored architecture to the CNN …

CNNParted: An open source framework for efficient Convolutional Neural Network inference partitioning in embedded systems

F Kreß, V Sidorenko, P Schmidt, J Hoefer, T Hotfilter… - Computer Networks, 2023 - Elsevier
Applications such as autonomous driving or assistive robotics heavily rely on the usage of
Deep Neural Networks. In particular, Convolutional Neural Networks (CNNs) provide …

RISC-V virtual platform-based convolutional neural network accelerator implemented in SystemC

SH Lim, WSW Suh, JY Kim, SY Cho - Electronics, 2021 - mdpi.com
The optimization for hardware processor and system for performing deep learning
operations such as Convolutional Neural Networks (CNN) in resource limited embedded …

Optimization of communication schemes for DMA-controlled accelerators

J Wang, S Park, CS Park - IEEE Access, 2021 - ieeexplore.ieee.org
The hardware accelerator controlled by direct memory access (DMA) is greatly influenced by
the communication bandwidth from/to DRAM through on-chip buses. This paper proposes a …

System-level communication performance estimation for DMA-controlled accelerators

S Kim, S Park, CS Park - IEEE Access, 2021 - ieeexplore.ieee.org
The performance of a hardware accelerator is often limited by the communication bandwidth
between local on-chip memories and DRAM across on-chip bus. In this paper, a system …

Spatial data dependence graph based pre-RTL simulator for convolutional neural network dataflows

J Wang, S Park, CS Park - IEEE Access, 2022 - ieeexplore.ieee.org
In this paper, a new pre-RTL simulator is proposed to predict the power, performance, and
area of convolutional neural network (CNN) dataflows prior to register-transfer-level (RTL) …

Optimization of multi-core accelerator performance based on accurate performance estimation

S Kim, Y Seo, S Park, CS Park - IEEE Access, 2022 - ieeexplore.ieee.org
Multicore accelerators have emerged to efficiently execute recent applications with complex
computational dimensions. Compared to a single-core accelerator, a multicore accelerator …

A framework for fast architecture exploration of convolutional neural network accelerators

LR Juracy - 2022 - tede2.pucrs.br
Machine Learning (ML) is a sub-area of artificial intelligence comprehending algorithms to
solve classification and pattern recognition problems. One of the most common ways to …