Making AI less "thirsty": Uncovering and addressing the secret water footprint of AI models

P Li, J Yang, MA Islam, S Ren - arXiv preprint arXiv:2304.03271, 2023 - arxiv.org
The growing carbon footprint of artificial intelligence (AI) models, especially large ones such
as GPT-3, has been undergoing public scrutiny. Unfortunately, however, the equally …

Drawing early-bird tickets: Towards more efficient training of deep networks

H You, C Li, P Xu, Y Fu, Y Wang, X Chen… - arXiv preprint arXiv …, 2019 - arxiv.org
(Frankle & Carbin, 2019) shows that there exist winning tickets (small but critical
subnetworks) in dense, randomly initialized networks that can be trained alone to achieve …

Sparseloop: An analytical approach to sparse tensor accelerator modeling

YN Wu, PA Tsai, A Parashar, V Sze… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org
In recent years, many accelerators have been proposed to efficiently process sparse tensor
algebra applications (e.g., sparse neural networks). However, these proposals are single …

GPT4AIGChip: Towards next-generation AI accelerator design automation via large language models

Y Fu, Y Zhang, Z Yu, S Li, Z Ye, C Li… - 2023 IEEE/ACM …, 2023 - ieeexplore.ieee.org
The remarkable capabilities and intricate nature of Artificial Intelligence (AI) have
dramatically escalated the imperative for specialized AI accelerators. Nonetheless …

Hardware acceleration of sparse and irregular tensor computations of ML models: A survey and insights

S Dave, R Baghdadi, T Nowatzki… - Proceedings of the …, 2021 - ieeexplore.ieee.org
Machine learning (ML) models are widely used in many important domains. For efficiently
processing these computation- and memory-intensive applications, tensors of these …

AutoDSE: Enabling software programmers to design efficient FPGA accelerators

A Sohrabizadeh, CH Yu, M Gao, J Cong - ACM Transactions on Design …, 2022 - dl.acm.org
Adopting FPGA as an accelerator in datacenters is becoming mainstream for customized
computing, but the fact that FPGAs are hard to program creates a steep learning curve for …

EDD: Efficient differentiable DNN architecture and implementation co-search for embedded AI solutions

Y Li, C Hao, X Zhang, X Liu, Y Chen… - 2020 57th ACM/IEEE …, 2020 - ieeexplore.ieee.org
High quality AI solutions require joint optimization of AI algorithms and their hardware
implementations. In this work, we are the first to propose a fully simultaneous, Efficient …

HASCO: Towards agile hardware and software co-design for tensor computation

Q Xiao, S Zheng, B Wu, P Xu, X Qian… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
Tensor computations overwhelm traditional general-purpose computing devices due to their
large volumes of data and operations. They call for a holistic solution …

TIMELY: Pushing data movements and interfaces in PIM accelerators towards local and in time domain

W Li, P Xu, Y Zhao, H Li, Y Xie… - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
Resistive-random-access-memory (ReRAM) based processing-in-memory (R2PIM)
accelerators show promise in bridging the gap between Internet of Things devices' …

Review of neural network model acceleration techniques based on FPGA platforms

F Liu, H Li, W Hu, Y He - Neurocomputing, 2024 - Elsevier
Neural network models, celebrated for their outstanding scalability and computational
capabilities, have demonstrated remarkable performance across various fields such as …