Efficient acceleration of deep learning inference on resource-constrained edge devices: A review
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …
Hardware approximate techniques for deep neural network accelerators: A survey
Deep Neural Networks (DNNs) are very popular because of their high performance in
various cognitive tasks in Machine Learning (ML). Recent advancements in DNNs have …
Training deep neural networks with 8-bit floating point numbers
The state-of-the-art hardware platforms for training deep neural networks are moving from
traditional single precision (32-bit) computations towards 16 bits of precision, in large part …
Ultra-low precision 4-bit training of deep neural networks
In this paper, we propose a number of novel techniques and numerical representation
formats that enable, for the very first time, the precision of training systems to be aggressively …
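To make concrete what reduced-precision training trades away, here is a minimal illustrative sketch (not the exact 16-, 8-, or 4-bit formats proposed in the two papers above): it simulates a weight update carried out at 16-bit floating-point precision with NumPy and compares it to the 32-bit result.

```python
import numpy as np

rng = np.random.default_rng(0)
weights_fp32 = rng.standard_normal(1000).astype(np.float32)
grads = (1e-4 * rng.standard_normal(1000)).astype(np.float32)

# Reference update in single precision (32-bit).
updated_fp32 = weights_fp32 - grads

# Same update with both operands stored at 16 bits (IEEE half precision here;
# the cited papers design custom reduced-precision floating-point formats).
updated_fp16 = (weights_fp32.astype(np.float16)
                - grads.astype(np.float16)).astype(np.float32)

# Small gradient contributions can vanish at reduced precision, which is why
# the low-precision training literature adds compensation techniques such as
# loss scaling and chunk-based accumulation.
print("mean |update difference|:", np.abs(updated_fp32 - updated_fp16).mean())
```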
In-memory computing: Advances and prospects
IMC has the potential to address a critical and foundational challenge affecting computing
platforms today, that is, the high energy and delay costs of moving data and accessing data …
In-memory computing with emerging memory devices: Status and outlook
In-memory computing (IMC) has emerged as a new computing paradigm able to alleviate or
suppress the memory bottleneck, which is the major concern for energy efficiency and …
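As a rough illustration of why in-memory computing sidesteps the data-movement cost described in the two entries above, the NumPy sketch below maps a matrix-vector product onto a hypothetical resistive crossbar: weights are stored as cell conductances and the product is formed in place as accumulated currents. The device values and voltage scaling are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal((4, 8))     # logical weight matrix
inputs = rng.standard_normal(8)           # input activations

# Assumed device parameters (purely illustrative).
g_max = 1e-6                              # maximum cell conductance in siemens
v_read = 0.2                              # read voltage scaling in volts

scale = g_max / np.abs(weights).max()
conductances = weights * scale            # real arrays use differential cell pairs for signed weights
voltages = inputs * v_read

# Ohm's law per cell plus Kirchhoff's current law per line yield the
# matrix-vector product directly where the weights are stored.
currents = conductances @ voltages

# The analog result matches the digital product up to known scaling factors.
print(np.allclose(currents / (scale * v_read), weights @ inputs))
```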
Approximate Computing
W Liu, F Lombardi - 2022 - Springer
Computing systems at all scales (from mobile handheld devices to supercomputers, servers,
and large cloud-based data centers) have seen significant performance gains, mostly …
GOBO: Quantizing attention-based NLP models for low latency and energy efficient inference
Attention-based models have demonstrated remarkable success in various natural
language understanding tasks. However, efficient execution remains a challenge for these …
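The snippet below is a loose, hypothetical sketch of dictionary-based weight quantization with a separately stored outlier group, which is the general flavor of scheme GOBO belongs to; the grouping rule, bit width, and level placement are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def quantize_with_outliers(w, bits=3, outlier_sigma=3.0):
    """Keep rare large-magnitude weights in full precision; map the rest
    onto a tiny dictionary of 2**bits representative values."""
    mu, sigma = w.mean(), w.std()
    outliers = np.abs(w - mu) > outlier_sigma * sigma
    inliers = w[~outliers]

    # Evenly spaced dictionary over the inlier range (the cited paper
    # derives its dictionary from the weight distribution instead).
    levels = np.linspace(inliers.min(), inliers.max(), 2 ** bits)
    idx = np.abs(inliers[:, None] - levels[None, :]).argmin(axis=1)

    wq = w.copy()
    wq[~outliers] = levels[idx]   # stored as few-bit indices plus the dictionary
    return wq, outliers.mean()

rng = np.random.default_rng(0)
w = rng.standard_normal(10_000).astype(np.float32)
wq, outlier_fraction = quantize_with_outliers(w)
print(f"outlier fraction: {outlier_fraction:.4f}, "
      f"max quantization error: {np.abs(w - wq).max():.4f}")
```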
Accurate and efficient 2-bit quantized neural networks
J Choi, S Venkataramani… - Proceedings of …, 2019 - proceedings.mlsys.org
Deep learning algorithms achieve high classification accuracy at the expense of significant
computation cost. In order to reduce this cost, several quantization schemes have gained …
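For reference, a generic uniform low-bit quantizer of the kind such 2-bit schemes build on is sketched below; the cited work additionally learns clipping ranges and chooses weight scales during training, which this toy version omits, and the clip value here is an arbitrary assumption.

```python
import numpy as np

def uniform_quantize(x, bits=2, clip=1.0):
    """Clip to [-clip, clip] and snap onto 2**bits evenly spaced levels."""
    n_levels = 2 ** bits
    step = 2 * clip / (n_levels - 1)
    x_clipped = np.clip(x, -clip, clip)
    codes = np.round((x_clipped + clip) / step)   # integer code in [0, n_levels - 1]
    return codes * step - clip                    # dequantized value

w = np.random.default_rng(0).standard_normal(8).astype(np.float32)
print(uniform_quantize(w, bits=2))                # every value lands on one of 4 levels
```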
A retrospective and prospective view of approximate computing [Point of View]
W Liu, F Lombardi, M Schulte - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Computing systems are conventionally designed to operate as accurately as possible.
However, this trend faces severe technology challenges, such as power consumption, circuit …