FracBNN: Accurate and FPGA-efficient binary neural networks with fractional activations
Binary neural networks (BNNs) have 1-bit weights and activations. Such networks are well
suited for FPGAs, as their dominant computations are bitwise arithmetic and the memory …
A survey on the optimization of neural network accelerators for micro-AI on-device inference
Deep neural networks (DNNs) are being prototyped for a variety of artificial intelligence (AI)
tasks including computer vision, data analytics, robotics, etc. The efficacy of DNNs coincides …
Enabling design methodologies and future trends for edge AI: Specialization and codesign
This work is an introduction and a survey for the Special Issue on Machine Intelligence at the
Edge. The authors argue that workloads that were formerly performed in the cloud are …
VecQ: Minimal loss DNN model compression with vectorized weight quantization
Quantization has been proven to be an effective method for reducing the computing and/or
storage cost of DNNs. However, the trade-off between the quantization bitwidth and final …
FPGA-based deep learning inference accelerators: Where are we standing?
Recently, artificial intelligence applications have become part of almost all emerging
technologies around us. Neural networks, in particular, have shown significant advantages …
TAS: ternarized neural architecture search for resource-constrained edge devices
Ternary Neural Networks (TNNs) compress network weights and activation functions into 2-
bit representation resulting in remarkable network compression and energy efficiency …
Benchmarking edge computing devices for grape bunches and trunks detection using accelerated object detection single shot multibox deep learning models
Purpose: Visual perception enables robots to perceive the environment. Visual data is
processed using computer vision algorithms that are usually time-expensive and require …
WinoCNN: Kernel sharing Winograd systolic array for efficient convolutional neural network acceleration on FPGAs
The combination of Winograd's algorithm and systolic array architecture has demonstrated
the capability of improving DSP efficiency in accelerating convolutional neural networks …
Algorithm/Accelerator co-design and co-search for edge AI
The world has seen the great success of deep neural networks (DNNs) in a massive number
of artificial intelligence (AI) applications. However, developing high-quality AI services to …
QS-NAS: Optimally quantized scaled architecture search to enable efficient on-device micro-AI
M Hosseini, T Mohsenin - … on Emerging and Selected Topics in …, 2021 - ieeexplore.ieee.org
Because of their simple hardware requirements, low bitwidth neural networks (NN) have
gained significant attention over the recent years, and have been extensively employed in …