FPnew: An open-source multiformat floating-point unit architecture for energy-proportional transprecision computing

S Mach, F Schuiki, F Zaruba… - IEEE Transactions on Very …, 2020 - ieeexplore.ieee.org
The slowdown of Moore's law and the power wall necessitate a shift toward finely tunable
precision (a.k.a. transprecision) computing to reduce the energy footprint. Hence, we need circuits …

Efficient multiple-precision floating-point fused multiply-add with mixed-precision support

H Zhang, D Chen, SB Ko - IEEE Transactions on Computers, 2019 - ieeexplore.ieee.org
In this paper, an efficient multiple-precision floating-point fused multiply-add (FMA) unit is
proposed. The proposed FMA supports not only single-precision, double-precision, and …
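
At its core, an FMA computes a × b + c with a single rounding at the end, which is what makes precision handling in such units delicate. A minimal C sketch of why that single rounding matters (a generic illustration, not the unit proposed in this paper):

```c
#include <math.h>
#include <stdio.h>

int main(void) {
    /* Operands chosen so the exact product carries a tail bit that a
       double cannot hold, making the intermediate rounding visible. */
    double a = 1.0 + 0x1p-27;       /* 1 + 2^-27 */
    double b = 1.0 + 0x1p-27;
    double c = -(1.0 + 0x1p-26);

    double separate = a * b + c;    /* two roundings: after * and after + */
    double fused    = fma(a, b, c); /* one rounding at the very end */

    printf("separate: %a\n", separate); /* 0x0p+0: the 2^-54 tail is lost */
    printf("fused:    %a\n", fused);    /* 0x1p-54: the exact result */
    return 0;
}
```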

Precision- and accuracy-reconfigurable processor architectures—An overview

M Brand, F Hannig, O Keszocze… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
High performance and, at the same time, energy efficiency are important yet often conflicting
requirements in many fields of emerging applications. Those applications range from multi …

A configurable floating-point multiple-precision processing element for HPC and AI converged computing

W Mao, K Li, Q Cheng, L Dai, B Li, X Xie… - … Transactions on Very …, 2021 - ieeexplore.ieee.org
There is an emerging need to design configurable accelerators for high-performance
computing (HPC) and artificial intelligence (AI) applications at different precisions. Thus, the …

Multiple-mode-supporting floating-point FMA unit for deep learning processors

H Tan, G Tong, L Huang, L Xiao… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
In this article, a new multiple-mode floating-point fused multiply-add (FMA) unit is proposed
for deep learning processors. The proposed design supports three functional modes …
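
Multiple-mode deep-learning FMAs typically let one datapath serve several operand formats; a common pairing is narrow (e.g., FP16) multiplies feeding a wider (FP32) accumulator. A hedged C sketch of that one mode, as an illustration of the idea rather than this paper's actual mode set:

```c
#include <stdio.h>

/* _Float16 needs compiler/target support (e.g., GCC or Clang on AArch64);
   the mode shown here is illustrative, not taken from the paper. */
typedef _Float16 f16;

/* Mixed-precision FMA mode: FP16 operands multiplied, accumulated in FP32.
   The 11-bit x 11-bit significand product fits FP32's 24 bits exactly,
   so only the accumulation step rounds. */
static float fma_f16_f32(f16 a, f16 b, float acc) {
    return (float)a * (float)b + acc;
}

int main(void) {
    f16 x[4] = {(f16)1.5f, (f16)2.25f, (f16)-0.5f, (f16)3.0f};
    f16 w[4] = {(f16)0.5f, (f16)1.0f,  (f16)2.0f,  (f16)-1.0f};
    float acc = 0.0f;
    for (int i = 0; i < 4; i++)
        acc = fma_f16_f32(x[i], w[i], acc);
    printf("dot = %f\n", acc);  /* 0.75 + 2.25 - 1.0 - 3.0 = -1.0 */
    return 0;
}
```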

A vector systolic accelerator for multi-precision floating-point high-performance computing

K Li, W Mao, J Zhou, B Li, Z Yang… - … on Circuits and …, 2022 - ieeexplore.ieee.org
There is an emerging need to design multi-precision floating-point (FP) accelerators for high-
performance computing (HPC) applications. The commonly used methods are based on …

An energy-efficient mixed-bitwidth systolic accelerator for NAS-optimized deep neural networks

W Mao, L Dai, K Li, Q Cheng, Y Wang… - … Transactions on Very …, 2022 - ieeexplore.ieee.org
Optimized deep neural network (DNN) models and energy-efficient hardware designs are of
great importance in edge-computing applications. The neural architecture search (NAS) …

FAUST: design and implementation of a pipelined RISC-V vector floating-point unit

M Kovač, L Dragić, B Malnar, F Minervini… - Microprocessors and …, 2023 - Elsevier
In this paper, we present Faust, a pipelined FPU for a vector-processing-capable RISC-V core
developed within the European Processor Initiative (EPI) project. Faust is based on the open …
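
In a vector FPU of this kind, one instruction applies the FMA across a whole vector register. Below is a plain-C loop that an RVV-enabled compiler (e.g., clang -O3 -march=rv64gcv) can map to the vfmacc.vv vector instruction; this is a generic sketch, not Faust's internals:

```c
#include <math.h>
#include <stddef.h>

/* y[i] += a[i] * b[i] for all i. With RISC-V "V" auto-vectorization,
   the per-element fmaf lanes become vfmacc.vv over vector registers. */
void vec_fma(float *y, const float *a, const float *b, size_t n) {
    for (size_t i = 0; i < n; i++)
        y[i] = fmaf(a[i], b[i], y[i]);
}
```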

Efficient dual-precision floating-point fused-multiply-add architecture

V Arunachalam, ANJ Raj, N Hampannavar… - Microprocessors and …, 2018 - Elsevier
The fused multiply-add (FMA) instruction has been common in RISC processors since
1990. A 3-stage, 8-level pipelined, dual-precision FMA is proposed here that can perform …

All-Digital Computing-in-Memory Macro Supporting FP64-Based Fused Multiply-Add Operation

D Li, K Mo, L Liu, B Pan, W Li, W Kang, L Li - Applied Sciences, 2023 - mdpi.com
Recently, frequent data movement between computing units and memory during floating-
point arithmetic has become a major problem for scientific computing. Computing-in-memory …