Making AI less "thirsty": Uncovering and addressing the secret water footprint of AI models

P Li, J Yang, MA Islam, S Ren - arXiv preprint arXiv:2304.03271, 2023 - arxiv.org
The growing carbon footprint of artificial intelligence (AI) models, especially large ones such
as GPT-3, has been undergoing public scrutiny. Unfortunately, however, the equally …

Drawing early-bird tickets: Towards more efficient training of deep networks

H You, C Li, P Xu, Y Fu, Y Wang, X Chen… - arXiv preprint arXiv …, 2019 - arxiv.org
(Frankle & Carbin, 2019) shows that there exist winning tickets (small but critical
subnetworks) in dense, randomly initialized networks that can be trained alone to achieve …

Sparseloop: An analytical approach to sparse tensor accelerator modeling

YN Wu, PA Tsai, A Parashar, V Sze… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org
In recent years, many accelerators have been proposed to efficiently process sparse tensor
algebra applications (e.g., sparse neural networks). However, these proposals are single …

GPT4AIGChip: Towards next-generation AI accelerator design automation via large language models

Y Fu, Y Zhang, Z Yu, S Li, Z Ye, C Li… - 2023 IEEE/ACM …, 2023 - ieeexplore.ieee.org
The remarkable capabilities and intricate nature of Artificial Intelligence (AI) have
dramatically escalated the imperative for specialized AI accelerators. Nonetheless …

Hardware acceleration of sparse and irregular tensor computations of ML models: A survey and insights

S Dave, R Baghdadi, T Nowatzki… - Proceedings of the …, 2021 - ieeexplore.ieee.org
Machine learning (ML) models are widely used in many important domains. For efficiently
processing these computation- and memory-intensive applications, tensors of these …

AutoDSE: Enabling software programmers to design efficient FPGA accelerators

A Sohrabizadeh, CH Yu, M Gao, J Cong - ACM Transactions on Design …, 2022 - dl.acm.org
Adopting FPGA as an accelerator in datacenters is becoming mainstream for customized
computing, but the fact that FPGAs are hard to program creates a steep learning curve for …

EDD: Efficient differentiable DNN architecture and implementation co-search for embedded AI solutions

Y Li, C Hao, X Zhang, X Liu, Y Chen… - 2020 57th ACM/IEEE …, 2020 - ieeexplore.ieee.org
High quality AI solutions require joint optimization of AI algorithms and their hardware
implementations. In this work, we are the first to propose a fully simultaneous, Efficient …

HASCO: Towards agile hardware and software co-design for tensor computation

Q Xiao, S Zheng, B Wu, P Xu, X Qian… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
Tensor computations overwhelm traditional general-purpose computing devices due to their
large volumes of data and operations. They call for a holistic solution …

TIMELY: Pushing data movements and interfaces in PIM accelerators towards local and in time domain

W Li, P Xu, Y Zhao, H Li, Y Xie… - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
Resistive-random-access-memory (ReRAM) based processing-in-memory (R2PIM)
accelerators show promise in bridging the gap between Internet of Things devices' …

Review of neural network model acceleration techniques based on FPGA platforms

F Liu, H Li, W Hu, Y He - Neurocomputing, 2024 - Elsevier
Neural network models, celebrated for their outstanding scalability and computational
capabilities, have demonstrated remarkable performance across various fields such as …