Making AI less "thirsty": Uncovering and addressing the secret water footprint of AI models
The growing carbon footprint of artificial intelligence (AI) models, especially large ones such
as GPT-3, has been undergoing public scrutiny. Unfortunately, however, the equally …
Drawing early-bird tickets: Towards more efficient training of deep networks
(Frankle & Carbin, 2019) shows that there exist winning tickets (small but critical
subnetworks) for dense, randomly initialized networks that can be trained alone to achieve …
Sparseloop: An analytical approach to sparse tensor accelerator modeling
In recent years, many accelerators have been proposed to efficiently process sparse tensor
algebra applications (e.g., sparse neural networks). However, these proposals are single …
GPT4AIGChip: Towards next-generation AI accelerator design automation via large language models
The remarkable capabilities and intricate nature of Artificial Intelligence (AI) have
dramatically escalated the imperative for specialized AI accelerators. Nonetheless …
Hardware acceleration of sparse and irregular tensor computations of ml models: A survey and insights
Machine learning (ML) models are widely used in many important domains. For efficiently
processing these computational- and memory-intensive applications, tensors of these …
AutoDSE: Enabling software programmers to design efficient FPGA accelerators
Adopting FPGA as an accelerator in datacenters is becoming mainstream for customized
computing, but the fact that FPGAs are hard to program creates a steep learning curve for …
EDD: Efficient differentiable DNN architecture and implementation co-search for embedded AI solutions
High quality AI solutions require joint optimization of AI algorithms and their hardware
implementations. In this work, we are the first to propose a fully simultaneous, Efficient …
HASCO: Towards agile hardware and software co-design for tensor computation
Tensor computations overwhelm traditional general-purpose computing devices due to the
large amounts of data and operations of the computations. They call for a holistic solution …
TIMELY: Pushing data movements and interfaces in PIM accelerators towards local and in time domain
Resistive-random-access-memory (ReRAM) based processing-in-memory (R2PIM)
accelerators show promise in bridging the gap between Internet of Things devices' …
Review of neural network model acceleration techniques based on FPGA platforms
F Liu, H Li, W Hu, Y He - Neurocomputing, 2024 - Elsevier
Neural network models, celebrated for their outstanding scalability and computational
capabilities, have demonstrated remarkable performance across various fields such as …