Posits: the good, the bad and the ugly

F De Dinechin, L Forget, JM Muller… - Proceedings of the …, 2019 - dl.acm.org
Many properties of the IEEE-754 floating-point number system are taken for granted in
modern computers and are deeply embedded in compilers and low-level software routines …

Reconfigurable computing architectures

R Tessier, K Pocek, A DeHon - Proceedings of the IEEE, 2015 - ieeexplore.ieee.org
Reconfigurable architectures can bring unique capabilities to computational tasks. They
offer the performance and energy efficiency of hardware with the flexibility of software. In …

BLAS comparison on FPGA, CPU and GPU

S Kestur, JD Davis, O Williams - 2010 IEEE computer society …, 2010 - ieeexplore.ieee.org
High Performance Computing (HPC) or scientific codes are being executed across a wide
variety of computing platforms from embedded processors to massively parallel GPUs. We …

Ultralow-latency hardware-in-the-loop platform for rapid validation of power electronics designs

D Majstorovic, I Celanovic, ND Teslic… - IEEE Transactions …, 2011 - ieeexplore.ieee.org
This paper introduces a unified approach to the validation of power-electronics (PE) control
hardware, firmware, and software designs. It is based on a scalable application-specific …

Evaluating the hardware cost of the posit number system

Y Uguen, L Forget… - 2019 29th International …, 2019 - ieeexplore.ieee.org
The posit number system is proposed as a replacement of IEEE floating-point numbers. It is
a floating-point system that trades exponent bits for significand bits, depending on the …

Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs

S Aggarwal, HJ Damsgaard… - … Conference on Field …, 2024 - ieeexplore.ieee.org
Post-training quantization (PTQ) is a powerful technique for model compression, reducing
the numerical precision in neural networks without additional training overhead. Recent …

Generating high-performance custom floating-point pipelines

F De Dinechin, C Klein, B Pasca - … International Conference on …, 2009 - ieeexplore.ieee.org
Custom operators, working at custom precisions, are a key ingredient to fully exploit the
FPGA flexibility advantage for high-performance computing. Unfortunately, such operators …

A generator of numerically-tailored and high-throughput accelerators for batched GEMMs

L Ledoux, M Casas - 2022 IEEE 30th Annual International …, 2022 - ieeexplore.ieee.org
We propose a hardware generator of GEMM accelerators. Our generator produces
vendor-agnostic HDL describing highly customizable systolic arrays guided by accuracy and energy …

Parallel Accurate Minifloat MACCs for Neural Network Inference on Versal FPGAs

HJ Damsgaard, KJ Hoßfeld, J Nurmi… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Machine Learning (ML) is ubiquitous in contemporary applications. Its need for efficient
acceleration has driven vast research efforts into the quantization of neural networks with …

Design and implementation of Novel 32-bit MAC unit for DSP applications

HM Rakesh, GS Sunitha - 2020 International Conference for …, 2020 - ieeexplore.ieee.org
In today's smart and fast computing world, the designing of high speed and low energy
consumption based Digital Signal Processors (DSPs) is a realistic and ever embryonic area …