Posits: the good, the bad and the ugly
F De Dinechin, L Forget, JM Muller… - Proceedings of the …, 2019 - dl.acm.org
Many properties of the IEEE-754 floating-point number system are taken for granted in
modern computers and are deeply embedded in compilers and low-level software routines …
Reconfigurable computing architectures
Reconfigurable architectures can bring unique capabilities to computational tasks. They
offer the performance and energy efficiency of hardware with the flexibility of software. In …
BLAS comparison on FPGA, CPU and GPU
S Kestur, JD Davis, O Williams - 2010 IEEE computer society …, 2010 - ieeexplore.ieee.org
High Performance Computing (HPC) or scientific codes are being executed across a wide
variety of computing platforms from embedded processors to massively parallel GPUs. We …
Ultralow-latency hardware-in-the-loop platform for rapid validation of power electronics designs
D Majstorovic, I Celanovic, ND Teslic… - IEEE Transactions …, 2011 - ieeexplore.ieee.org
This paper introduces a unified approach to the validation of power-electronics (PE) control
hardware, firmware, and software designs. It is based on a scalable application-specific …
Evaluating the hardware cost of the posit number system
Y Uguen, L Forget… - 2019 29th International …, 2019 - ieeexplore.ieee.org
The posit number system is proposed as a replacement of IEEE floating-point numbers. It is
a floating-point system that trades exponent bits for significand bits, depending on the …
Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs
S Aggarwal, HJ Damsgaard… - … Conference on Field …, 2024 - ieeexplore.ieee.org
Post-training quantization (PTQ) is a powerful technique for model compression, reducing
the numerical precision in neural networks without additional training overhead. Recent …
Generating high-performance custom floating-point pipelines
Custom operators, working at custom precisions, are a key ingredient to fully exploit the
FPGA flexibility advantage for high-performance computing. Unfortunately, such operators …
A generator of numerically-tailored and high-throughput accelerators for batched GEMMs
We propose a hardware generator of GEMM accelerators. Our generator produces vendor-
agnostic HDL describing highly customizable systolic arrays guided by accuracy and energy …
Parallel Accurate Minifloat MACCs for Neural Network Inference on Versal FPGAs
HJ Damsgaard, KJ Hoßfeld, J Nurmi… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Machine Learning (ML) is ubiquitous in contemporary applications. Its need for efficient
acceleration has driven vast research efforts into the quantization of neural networks with …
Design and implementation of Novel 32-bit MAC unit for DSP applications
HM Rakesh, GS Sunitha - 2020 International Conference for …, 2020 - ieeexplore.ieee.org
In today's smart and fast computing world, the designing of high speed and low energy
consumption based Digital Signal Processors (DSPs) is a realistic and ever embryonic area …