An FPGA-based transformer accelerator using output block stationary dataflow for object recognition applications
The transformer-based model has great potential to deliver higher accuracy for object
recognition applications when comparing it with the convolution neural network (CNN). Yet …
Hardware-friendly logarithmic quantization with mixed-precision for MobileNetV2
In a variety of computer vision applications, convolutional neural networks (CNNs) have
achieved excellent accuracy. However, in order for a CNN to operate on embedded …
FxP-QNet: a post-training quantizer for the design of mixed low-precision DNNs with dynamic fixed-point representation
Deep neural networks (DNNs) have demonstrated their effectiveness in a wide range of
computer vision tasks, with the state-of-the-art results obtained through complex and deep …
DoubleQExt: Hardware and memory efficient CNN through two levels of quantization
To fulfil the tight area and memory constraints in IoT applications, the design of efficient
Convolutional Neural Network (CNN) hardware becomes crucial. Quantization of CNN is …
Energy-efficient high-speed ASIC implementation of convolutional neural network using novel reduced critical-path design
Convolutional Neural Network (CNN) plays an important role in several machine learning
tasks related to speech, image, and video processing applications. The increasing demand …
An Energy-Efficient Edge Processor for Radar-Based Continuous Fall Detection Utilizing Mixed-Radix FFT and Updated Block-Wise Computation
J Chen, K Lin, L Yang, W Ye - IEEE Internet of Things Journal, 2024 - ieeexplore.ieee.org
In the scenarios of the Internet of Things, fall detection holds increasing significance in the
health monitoring of elderly individuals. While most current research has achieved …
Design and implementation of an efficient CNN accelerator for low-cost FPGAs
Y Xu, S Wang, N Li, H Xiao - IEICE Electronics Express, 2022 - jstage.jst.go.jp
This paper proposes a computation-array-centered dataflow, which adjusts the convolution
with different kernel sizes to a unified computing manner and reduces the dimension of …
CNN Accelerator Using Proposed Diagonal Cyclic Array for Minimizing Memory Accesses.
HW Son, AA Al-Hamid, YS Na, DY Lee… - Computers, Materials & …, 2023 - researchgate.net
This paper presents the architecture of a Convolution Neural Network (CNN) accelerator
based on a new processing element (PE) array called a diagonal cyclic array (DCA). As …
ASLog: An Area-Efficient CNN Accelerator for Per-Channel Logarithmic Post-Training Quantization
Post-training quantization (PTQ) has been proven an efficient model compression technique
for Convolution Neural Networks (CNNs), without re-training or access to labeled datasets …
HPIPE-NX: Leveraging tensor blocks for high-performance CNN inference acceleration on FPGAs
MO Stan - 2022 - search.proquest.com
This thesis enhances the state-of-the-art CNN inference FPGA accelerator, HPIPE, to take
advantage of the tensor blocks on the AI-optimized Stratix 10 NX FPGA. We first further …