A survey of accelerator architectures for deep neural networks

Y Chen, Y Xie, L Song, F Chen, T Tang - Engineering, 2020 - Elsevier
Recently, due to the availability of big data and the rapid growth of computing power,
artificial intelligence (AI) has regained tremendous attention and investment. Machine …

A survey on deep neural network compression: Challenges, overview, and solutions

R Mishra, HP Gupta, T Dutta - arXiv preprint arXiv:2010.03954, 2020 - arxiv.org
Deep Neural Network (DNN) has gained unprecedented performance due to its automated
feature extraction capability. This high order performance leads to significant incorporation …

ELSA: Hardware-software co-design for efficient, lightweight self-attention mechanism in neural networks

TJ Ham, Y Lee, SH Seo, S Kim, H Choi… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
The self-attention mechanism is rapidly emerging as one of the most important key primitives
in neural networks (NNs) for its ability to identify the relations within input entities. The self …

Nonuniform-to-uniform quantization: Towards accurate quantization via generalized straight-through estimation

Z Liu, KT Cheng, D Huang… - Proceedings of the …, 2022 - openaccess.thecvf.com
The nonuniform quantization strategy for compressing neural networks usually achieves
better performance than its counterpart, i.e., uniform strategy, due to its superior …

DOTA: detect and omit weak attentions for scalable transformer acceleration

Z Qu, L Liu, F Tu, Z Chen, Y Ding, Y Xie - Proceedings of the 27th ACM …, 2022 - dl.acm.org
Transformer Neural Networks have demonstrated leading performance in many applications
spanning over language understanding, image processing, and generative modeling …

Transforming large-size to lightweight deep neural networks for IoT applications

R Mishra, H Gupta - ACM Computing Surveys, 2023 - dl.acm.org
Deep Neural Networks (DNNs) have gained unprecedented popularity due to their high-order
performance and automated feature extraction capability. This has encouraged …

Machine learning in real-time Internet of Things (IoT) systems: A survey

J Bian, A Al Arafat, H Xiong, J Li, L Li… - IEEE Internet of …, 2022 - ieeexplore.ieee.org
Over the last decade, machine learning (ML) and deep learning (DL) algorithms have
significantly evolved and been employed in diverse applications, such as computer vision …

AI augmented Edge and Fog computing: Trends and challenges

S Tuli, F Mirhakimi, S Pallewatta, S Zawad… - Journal of Network and …, 2023 - Elsevier
In recent years, the landscape of computing paradigms has witnessed a gradual yet
remarkable shift from monolithic computing to distributed and decentralized paradigms such …

Efficient AI system design with cross-layer approximate computing

S Venkataramani, X Sun, N Wang… - Proceedings of the …, 2020 - ieeexplore.ieee.org
Advances in deep neural networks (DNNs) and the availability of massive real-world data
have enabled superhuman levels of accuracy on many AI tasks and ushered the explosive …

ALERT: Accurate learning for energy and timeliness

C Wan, M Santriaji, E Rogers, H Hoffmann… - 2020 USENIX annual …, 2020 - usenix.org
An increasing number of software applications incorporate runtime Deep Neural Networks
(DNNs) to process sensor data and return inference results to humans. Effective deployment …