Domain-specific hardware accelerators
Communications of the ACM, July 2020, Vol. 63, No. 7, p. 48. Contributed articles.
From the simple embedded processor in your …
MachSuite: Benchmarks for accelerator design and customized architectures
Recent high-level synthesis and accelerator-related architecture papers show a great
disparity in workload selection. To improve standardization within the accelerator research …
High-level synthesis design space exploration: Past, present, and future
BC Schafer, Z Wang - … on Computer-Aided Design of Integrated …, 2019 - ieeexplore.ieee.org
This article presents a survey of the different modern high-level synthesis (HLS) design
space exploration (DSE) techniques that have been proposed so far to automatically …
CHARM: Composing Heterogeneous AcceleRators for Matrix Multiply on Versal ACAP Architecture
Dense matrix multiply (MM) serves as one of the most heavily used kernels in deep learning
applications. To cope with the high computation demands of these applications …
Democratizing domain-specific computing
General-purpose computers are widely used in our modern society.
There were close to 24 million software …
Accelerator-rich architectures: Opportunities and progresses
To drastically improve energy efficiency, we believe future processors need to go beyond
parallelization and provide architecture support for customization, enabling systems to adapt …
A fully pipelined and dynamically composable architecture of CGRA
Future processor chips will not be limited by the transistor resources, but will be mainly
constrained by energy efficiency. Reconfigurable fabrics bring higher energy efficiency than …
A ultra-low-energy convolution engine for fast brain-inspired vision in multicore clusters
State-of-art brain-inspired computer vision algorithms such as Convolutional Neural
Networks (CNNs) are reaching accuracy and performance rivaling that of humans; however …
BRAINIAC: Bringing reliable accuracy into neurally-implemented approximate computing
B Grigorian, N Farahpour… - 2015 IEEE 21st …, 2015 - ieeexplore.ieee.org
Applications with large amounts of data, real-time constraints, ultra-low power requirements,
and heavy computational complexity present significant challenges for modern computing …
Hybrid optimization/heuristic instruction scheduling for programmable accelerator codesign
Recent programmable accelerators are faster and more energy efficient than general
purpose processors, but expose complex hardware/software abstractions for compilers. A …