An FPGA-specific approach to floating-point accumulation and sum-of-products

F De Dinechin, L Forget, JM Muller… - Proceedings of the …, 2019 - dl.acm.org

Many properties of the IEEE-754 floating-point number system are taken for granted in
modern computers and are deeply embedded in compilers and low-level software routines …

Cited by 124 · Related articles · All 13 versions

Reconfigurable computing architectures

R Tessier, K Pocek, A DeHon - Proceedings of the IEEE, 2015 - ieeexplore.ieee.org

Reconfigurable architectures can bring unique capabilities to computational tasks. They
offer the performance and energy efficiency of hardware with the flexibility of software. In …

Cited by 283 · Related articles · All 5 versions

BLAS comparison on FPGA, CPU and GPU

S Kestur, JD Davis, O Williams - 2010 IEEE computer society …, 2010 - ieeexplore.ieee.org

High Performance Computing (HPC) or scientific codes are being executed across a wide
variety of computing platforms from embedded processors to massively parallel GPUs. We …

Cited by 264 · Related articles · All 11 versions

Ultralow-latency hardware-in-the-loop platform for rapid validation of power electronics designs

D Majstorovic, I Celanovic, ND Teslic… - IEEE Transactions …, 2011 - ieeexplore.ieee.org

This paper introduces a unified approach to the validation of power-electronics (PE) control
hardware, firmware, and software designs. It is based on a scalable application-specific …

Cited by 162 · Related articles · All 2 versions

Evaluating the hardware cost of the posit number system

Y Uguen, L Forget… - 2019 29th International …, 2019 - ieeexplore.ieee.org

The posit number system is proposed as a replacement of IEEE floating-point numbers. It is
a floating-point system that trades exponent bits for significand bits, depending on the …

Cited by 74 · Related articles · All 6 versions

Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs

S Aggarwal, HJ Damsgaard… - … Conference on Field …, 2024 - ieeexplore.ieee.org

Post-training quantization (PTQ) is a powerful technique for model compression, reducing
the numerical precision in neural networks without additional training overhead. Recent …

Cited by 3 · Related articles · All 4 versions

Generating high-performance custom floating-point pipelines

F De Dinechin, C Klein, B Pasca - … International Conference on …, 2009 - ieeexplore.ieee.org

Custom operators, working at custom precisions, are a key ingredient to fully exploit the
FPGA flexibility advantage for high-performance computing. Unfortunately, such operators …

Cited by 82 · Related articles · All 12 versions

A generator of numerically-tailored and high-throughput accelerators for batched GEMMs

L Ledoux, M Casas - 2022 IEEE 30th Annual International …, 2022 - ieeexplore.ieee.org

We propose a hardware generator of GEMM accelerators. Our generator produces vendor-
agnostic HDL describing highly customizable systolic arrays guided by accuracy and energy …

Cited by 8 · Related articles · All 4 versions

Parallel Accurate Minifloat MACCs for Neural Network Inference on Versal FPGAs

HJ Damsgaard, KJ Hoßfeld, J Nurmi… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Machine Learning (ML) is ubiquitous in contemporary applications. Its need for efficient
acceleration has driven vast research efforts into the quantization of neural networks with …

Cited by 1 · Related articles

Design and implementation of Novel 32-bit MAC unit for DSP applications

HM Rakesh, GS Sunitha - 2020 International Conference for …, 2020 - ieeexplore.ieee.org

In today's smart and fast computing world, the designing of high speed and low energy
consumption based Digital Signal Processors (DSPs) is a realistic and ever embryonic area …

Cited by 17 · Related articles