Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs

S Aggarwal, HJ Damsgaard… - … Conference on Field …, 2024 - ieeexplore.ieee.org
Post-training quantization (PTQ) is a powerful technique for model compression, reducing
the numerical precision in neural networks without additional training overhead. Recent …
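The PTQ idea described in this entry — lowering numerical precision after training, with no retraining — can be illustrated with a minimal sketch. This assumes a generic symmetric int8 scheme for demonstration only, not the specific minifloat or integer formats studied in the papers listed here.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float weights to int8 using a per-tensor scale (symmetric PTQ)."""
    scale = np.max(np.abs(w)) / 127.0  # symmetric range [-127, 127]
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the quantized values."""
    return q.astype(np.float32) * scale

w = np.array([0.31, -1.27, 0.05, 0.9], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# w_hat approximates w; only the scale is calibrated, no training occurs
```

The only "overhead" here is calibrating the scale from the trained weights, which is what makes PTQ attractive compared with quantization-aware training.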

Post-training quantization with low-precision minifloats and integers on FPGAs

S Aggarwal, A Pappalardo, HJ Damsgaard… - arXiv preprint arXiv …, 2023 - arxiv.org
Post-Training Quantization (PTQ) is a powerful technique for model compression, reducing
the precision of neural networks without additional training overhead. Recent works have …

A2Q+: Improving Accumulator-Aware Weight Quantization

I Colbert, A Pappalardo, J Petri-Koenig… - arXiv preprint arXiv …, 2024 - arxiv.org
Quantization techniques commonly reduce the inference costs of neural networks by
restricting the precision of weights and activations. Recent studies show that also reducing …

Accumulator-Aware Post-Training Quantization

I Colbert, F Grob, G Franco, J Zhang, R Saab - arXiv preprint arXiv …, 2024 - arxiv.org
Several recent studies have investigated low-precision accumulation, reporting
improvements in throughput, power, and area across various platforms. However, the …

CANET: Quantized Neural Network Inference With 8-bit Carry-Aware Accumulator

J Yang, X Wang, Y Jiang - IEEE Access, 2024 - ieeexplore.ieee.org
Neural network quantization represents weights and activations with few bits, greatly
reducing the overhead of multiplications. However, due to the recursive accumulation …

Method for Determining the Bit Grid Overflow of a Computer System Operating in the System of Residual Classes

AS Yanko, VA Krasnobayev, SB Nikolsky… - Radio Electronics …, 2024 - ric.zntu.edu.ua