Domain-specific hardware accelerators

WJ Dally, Y Turakhia, S Han - Communications of the ACM, 2020 - dl.acm.org
Communications of the ACM, July 2020, Vol. 63, No. 7, p. 48. From the simple embedded
processor in your …

MachSuite: Benchmarks for accelerator design and customized architectures

B Reagen, R Adolf, YS Shao, GY Wei… - 2014 IEEE …, 2014 - ieeexplore.ieee.org
Recent high-level synthesis and accelerator-related architecture papers show a great
disparity in workload selection. To improve standardization within the accelerator research …

High-level synthesis design space exploration: Past, present, and future

BC Schafer, Z Wang - … on Computer-Aided Design of Integrated …, 2019 - ieeexplore.ieee.org
This article presents a survey of the different modern high-level synthesis (HLS) design
space exploration (DSE) techniques that have been proposed so far to automatically …

CHARM: Composing Heterogeneous AcceleRators for Matrix Multiply on Versal ACAP Architecture

J Zhuang, J Lau, H Ye, Z Yang, Y Du, J Lo… - Proceedings of the …, 2023 - dl.acm.org
Dense matrix multiply (MM) serves as one of the most heavily used kernels in deep learning
applications. To cope with the high computation demands of these applications …

Democratizing domain-specific computing

Y Chi, W Qiao, A Sohrabizadeh, J Wang… - Communications of the …, 2022 - dl.acm.org
General-purpose computers are widely used in our modern society. There were close to
24 million software …

Accelerator-rich architectures: Opportunities and progresses

J Cong, MA Ghodrat, M Gill, B Grigorian… - Proceedings of the 51st …, 2014 - dl.acm.org
To drastically improve energy efficiency, we believe future processors need to go beyond
parallelization and provide architecture support for customization, enabling systems to adapt …

A fully pipelined and dynamically composable architecture of CGRA

J Cong, H Huang, C Ma, B Xiao… - 2014 IEEE 22nd Annual …, 2014 - ieeexplore.ieee.org
Future processor chips will not be limited by the transistor resources, but will be mainly
constrained by energy efficiency. Reconfigurable fabrics bring higher energy efficiency than …

A ultra-low-energy convolution engine for fast brain-inspired vision in multicore clusters

F Conti, L Benini - 2015 Design, Automation & Test in Europe …, 2015 - ieeexplore.ieee.org
State-of-art brain-inspired computer vision algorithms such as Convolutional Neural
Networks (CNNs) are reaching accuracy and performance rivaling that of humans; however …

BRAINIAC: Bringing reliable accuracy into neurally-implemented approximate computing

B Grigorian, N Farahpour… - 2015 IEEE 21st …, 2015 - ieeexplore.ieee.org
Applications with large amounts of data, real-time constraints, ultra-low power requirements,
and heavy computational complexity present significant challenges for modern computing …

Hybrid optimization/heuristic instruction scheduling for programmable accelerator codesign

T Nowatzki, N Ardalani, K Sankaralingam… - Proceedings of the 27th …, 2018 - dl.acm.org
Recent programmable accelerators are faster and more energy efficient than general
purpose processors, but expose complex hardware/software abstractions for compilers. A …