Adaptable approximate multiplier design based on input distribution and polarity

Z Li, S Zheng, J Zhang, Y Lu, J Gao, J Tao, L Wang
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2022 - ieeexplore.ieee.org
Approximate computing is an efficient approach to reducing design complexity in error-resilient applications. Multipliers are key arithmetic units in many applications, such as deep neural networks (DNNs) and digital signal processing (DSP) systems. This article proposes an open-source adaptable approximate multiplier design, driven by input distribution and polarity, that generates optimized approximate multipliers to trade off application-level performance against hardware cost. The proposed method minimizes the average squared absolute error of an approximate multiplier according to the probability distributions of the operands extracted from the target application, taking input polarity into account, and thereby achieves low hardware cost with negligible application-level performance loss. The method can generate unsigned multipliers based on the Braun multiplier or signed multipliers based on the Baugh–Wooley multiplier. To demonstrate its effectiveness, four designs are evaluated: three quantized DNNs of different scales (LeNet, AlexNet, and VGG16) using 8 × 8 unsigned multiplication, and an adaptive least mean square (LMS)-based finite impulse response (FIR) filter using 16 × 16 fixed-point signed multiplication. In the DNN training process, a noise training technique is adopted to reduce the accuracy loss caused by the approximation. Compared with state-of-the-art approximate multipliers, the generated multipliers achieve gains of up to 26.4% and 27.1% in the power-delay-area product, with negligible application-level performance loss, in the VGG16 and FIR applications, respectively.
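The objective named in the abstract, an average squared error weighted by the operand probability distributions, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the truncation-based multiplier, and the assumption that the two operands are independent are all hypothetical choices made for illustration, and input polarity is not modeled here.

```python
from itertools import product


def truncated_multiply(a: int, b: int, dropped_bits: int = 4) -> int:
    """Hypothetical approximate multiplier (an assumption, not from the paper):
    compute the exact product, then zero its lowest `dropped_bits` bits."""
    return ((a * b) >> dropped_bits) << dropped_bits


def weighted_mean_squared_error(approx, p_a, p_b) -> float:
    """Expected squared error of `approx` against exact multiplication.

    p_a[i] and p_b[j] are the probabilities that the operands take the
    values i and j. Operands are assumed independent (an assumption).
    """
    total = 0.0
    for a, b in product(range(len(p_a)), range(len(p_b))):
        error = approx(a, b) - a * b
        total += p_a[a] * p_b[b] * error * error
    return total


def uniform(n: int) -> list:
    """Uniform distribution over the values 0 .. n-1."""
    return [1.0 / n] * n


def skewed_to_small(n: int, decay: float = 0.9) -> list:
    """Illustrative distribution with most of its mass on small values."""
    weights = [decay ** i for i in range(n)]
    scale = sum(weights)
    return [w / scale for w in weights]


if __name__ == "__main__":
    n = 256  # all 8-bit unsigned operand values
    print("uniform operands:", weighted_mean_squared_error(
        truncated_multiply, uniform(n), uniform(n)))
    print("skewed operands: ", weighted_mean_squared_error(
        truncated_multiply, skewed_to_small(n), skewed_to_small(n)))
```

Running the script prints the same multiplier's score under two different operand distributions, which is the quantity a distribution-driven design flow would compare across candidate multipliers.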