SMAUG: End-to-end full-stack simulation infrastructure for deep learning workloads

S Xi, Y Yao, K Bhardwaj, P Whatmough… - ACM Transactions on …, 2020 - dl.acm.org
In recent years, there have been tremendous advances in hardware acceleration of deep
neural networks. However, most of the research has focused on optimizing accelerator …

gem5-SALAM: A system architecture for LLVM-based accelerator modeling

S Rogers, J Slycord, M Baharani… - 2020 53rd Annual IEEE …, 2020 - ieeexplore.ieee.org
With the prevalence of hardware accelerators as an integral part of the modern systems on
chip (SoCs), the ability to quickly and accurately model accelerators within the system it …

Bayesian optimization for efficient accelerator synthesis

A Mehrabi, A Manocha, BC Lee, DJ Sorin - ACM Transactions on …, 2020 - dl.acm.org
Accelerator design is expensive due to the effort required to understand an algorithm and
optimize the design. Architects have embraced two technologies to reduce costs. High-level …

gem5+RTL: A framework to enable RTL models inside a full-system simulator

G López-Paradís, A Armejach, M Moretó - Proceedings of the 50th …, 2021 - dl.acm.org
In recent years there has been a surge of interest in designing custom accelerators for
power-efficient high-performance computing. However, available tools to simulate low-level …

Expanding hardware accelerator system design space exploration with gem5-SALAMv2

Z Spencer, S Rogers, J Slycord, H Tabkhi - Journal of Systems Architecture, 2024 - Elsevier
With the prevalence of hardware accelerators as an integral part of the modern systems on
chip (SoCs), the ability to model accelerators quickly and accurately within the system in …

Prof5: A RISC-V profiler tool

J Silveira, L Castro, V Araújo, R Zeli… - 2022 IEEE 34th …, 2022 - ieeexplore.ieee.org
RISC-V is supported by a series of design and simulation tools that enable simple instruction
set customization and rapid exploration of application-specific accelerators. Evaluating the …

gem5-NVDLA: A Simulation Framework for Compiling, Scheduling and Architecture Evaluation on AI System-on-Chips

C Lai, W Zhang - ACM Transactions on Design Automation of Electronic …, 2024 - dl.acm.org
Recent years have seen an increasing trend in designing AI accelerators together with the
rest of the system, including CPUs and memory hierarchy. This trend calls for high-quality …

Energy efficient mapping on manycore with dynamic and partial reconfiguration: Application to a smart camera

R Bonamy, S Bilavarn, F Muller… - … Journal of Circuit …, 2018 - Wiley Online Library
This paper describes a methodology to improve the energy efficiency of high‐performance
multiprocessor architectures with dynamic and partial reconfiguration (DPR), based on a …

ERAS: A Flexible and Scalable Framework for Seamless Integration of RTL Models with Structural Simulation Toolkit

S Nema, SK Chunduru, C Kodigal… - 2023 IEEE …, 2023 - ieeexplore.ieee.org
The prevalence of custom Intellectual Properties (IPs) poses challenges for assessing their
system-level performance and functional behavior. Register Transfer Level (RTL) simulation …

A combined fast/cycle accurate simulation tool for reconfigurable accelerator evaluation: application to distributed data management

E Lenormand, T Goubier, L Cudennec… - … Workshop on Rapid …, 2020 - ieeexplore.ieee.org
Parallel computing systems based on reconfigurable accelerators are becoming (1)
increasingly heterogeneous, (2) difficult to design and (3) complex to model. Such modeling …