An FPGA-based transformer accelerator using output block stationary dataflow for object recognition applications

Z Zhao, R Cao, KF Un, WH Yu, PI Mak… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
The transformer-based model has great potential to deliver higher accuracy for object
recognition applications when comparing it with the convolution neural network (CNN). Yet …

Hardware-friendly logarithmic quantization with mixed-precision for MobileNetV2

D Choi, H Kim - 2022 IEEE 4th international conference on …, 2022 - ieeexplore.ieee.org
In a variety of computer vision applications, convolutional neural networks (CNNs) have
achieved excellent accuracy. However, in order for a CNN to operate on embedded …

FxP-QNet: a post-training quantizer for the design of mixed low-precision DNNs with dynamic fixed-point representation

A Shawahna, SM Sait, A El-Maleh, I Ahmad - IEEE Access, 2022 - ieeexplore.ieee.org
Deep neural networks (DNNs) have demonstrated their effectiveness in a wide range of
computer vision tasks, with the state-of-the-art results obtained through complex and deep …

DoubleQExt: Hardware and memory efficient CNN through two levels of quantization

JC See, HF Ng, HK Tan, JJ Chang, WK Lee… - IEEE …, 2021 - ieeexplore.ieee.org
To fulfil the tight area and memory constraints in IoT applications, the design of efficient
Convolutional Neural Network (CNN) hardware becomes crucial. Quantization of CNN is …

Energy-efficient high-speed ASIC implementation of convolutional neural network using novel reduced critical-path design

SS Lee, TD Nguyen, PK Meher, SY Park - IEEE Access, 2022 - ieeexplore.ieee.org
Convolutional Neural Network (CNN) plays an important role in several machine learning
tasks related to speech, image, and video processing applications. The increasing demand …

An Energy-Efficient Edge Processor for Radar-Based Continuous Fall Detection Utilizing Mixed-Radix FFT and Updated Block-Wise Computation

J Chen, K Lin, L Yang, W Ye - IEEE Internet of Things Journal, 2024 - ieeexplore.ieee.org
In the scenarios of the Internet of Things, fall detection holds increasing significance in the
health monitoring of elderly individuals. While most current research has achieved …

Design and implementation of an efficient CNN accelerator for low-cost FPGAs

Y Xu, S Wang, N Li, H Xiao - IEICE Electronics Express, 2022 - jstage.jst.go.jp
This paper proposes a computation-array-centered dataflow, which adjusts the convolution
with different kernel sizes to a unified computing manner and reduces the dimension of …

CNN Accelerator Using Proposed Diagonal Cyclic Array for Minimizing Memory Accesses

HW Son, AA Al-Hamid, YS Na, DY Lee… - Computers, Materials & …, 2023 - researchgate.net
This paper presents the architecture of a Convolution Neural Network (CNN) accelerator
based on a new processing element (PE) array called a diagonal cyclic array (DCA). As …

ASLog: An Area-Efficient CNN Accelerator for Per-Channel Logarithmic Post-Training Quantization

J Xu, J Fan, B Nan, C Ding, LR Zheng… - … on Circuits and …, 2023 - ieeexplore.ieee.org
Post-training quantization (PTQ) has been proven an efficient model compression technique
for Convolution Neural Networks (CNNs), without re-training or access to labeled datasets …

HPIPE-NX: Leveraging tensor blocks for high-performance CNN inference acceleration on FPGAs

MO Stan - 2022 - search.proquest.com
This thesis enhances the state-of-the-art CNN inference FPGA accelerator, HPIPE, to take
advantage of the tensor blocks on the AI-optimized Stratix 10 NX FPGA. We first further …