Applications and techniques for fast machine learning in science

AMC Deiana, N Tran, J Agar, M Blott… - Frontiers in big …, 2022 - frontiersin.org
In this community review report, we discuss applications and techniques for fast machine
learning (ML) in science—the concept of integrating powerful ML methods into the real-time …

Programming and synthesis for software-defined FPGA acceleration: status and future prospects

YH Lai, E Ustun, S Xiang, Z Fang, H Rong… - ACM Transactions on …, 2021 - dl.acm.org
FPGA-based accelerators are increasingly popular across a broad range of applications,
because they offer massive parallelism, high energy efficiency, and great flexibility for …

A compiler infrastructure for accelerator generators

R Nigam, S Thomas, Z Li, A Sampson - Proceedings of the 26th ACM …, 2021 - dl.acm.org
We present Calyx, a new intermediate language (IL) for compiling high-level programs into
hardware designs. Calyx combines a hardware-like structural language with a software-like …

Modular hardware design with timeline types

R Nigam, PH Azevedo de Amorim… - Proceedings of the ACM …, 2023 - dl.acm.org
Modular design is a key challenge for enabling large-scale reuse of hardware modules.
Unlike software, however, hardware designs correspond to physical circuits and inherit …

Archytas: A framework for synthesizing and dynamically optimizing accelerators for robotic localization

W Liu, B Yu, Y Gan, Q Liu, J Tang, S Liu… - MICRO-54: 54th Annual …, 2021 - dl.acm.org
Despite many recent efforts, accelerating robotic computing is still fundamentally
challenging for two reasons. First, the robotics software stack is extremely complicated …

Aha: An agile approach to the design of coarse-grained reconfigurable accelerators and compilers

K Koul, J Melchert, K Sreedhar, L Truong… - ACM Transactions on …, 2023 - dl.acm.org
With the slowing of Moore's law, computer architects have turned to domain-specific
hardware specialization to continue improving the performance and efficiency of computing …

Allo: A Programming Model for Composable Accelerator Design

H Chen, N Zhang, S Xiang, Z Zeng, M Dai… - Proceedings of the ACM …, 2024 - dl.acm.org
Special-purpose hardware accelerators are increasingly pivotal for sustaining performance
improvements in emerging applications, especially as the benefits of technology scaling …

Unified buffer: Compiling image processing and machine learning applications to push-memory accelerators

Q Liu, J Setter, D Huff, M Strange, K Feng… - ACM Transactions on …, 2023 - dl.acm.org
Image processing and machine learning applications benefit tremendously from hardware
acceleration. Existing compilers target either FPGAs, which sacrifice power and performance …

HECTOR: A multi-level intermediate representation for hardware synthesis methodologies

R Xu, Y Xiao, J Luo, Y Liang - Proceedings of the 41st IEEE/ACM …, 2022 - dl.acm.org
Hardware synthesis requires a complicated process to generate synthesizable register
transfer level (RTL) code. High-level synthesis tools can automatically transform a high-level …

HIDA: A Hierarchical Dataflow Compiler for High-Level Synthesis

H Ye, H Jun, D Chen - Proceedings of the 29th ACM International …, 2024 - dl.acm.org
Dataflow architectures are growing in popularity due to their potential to mitigate the
challenges posed by the memory wall inherent to the Von Neumann architecture. At the …