SMAUG: End-to-end full-stack simulation infrastructure for deep learning workloads
In recent years, there have been tremendous advances in hardware acceleration of deep
neural networks. However, most of the research has focused on optimizing accelerator …
gem5-SALAM: A system architecture for LLVM-based accelerator modeling
S Rogers, J Slycord, M Baharani… - 2020 53rd Annual IEEE …, 2020 - ieeexplore.ieee.org
With the prevalence of hardware accelerators as an integral part of the modern systems on
chip (SoCs), the ability to quickly and accurately model accelerators within the system it …
Bayesian optimization for efficient accelerator synthesis
Accelerator design is expensive due to the effort required to understand an algorithm and
optimize the design. Architects have embraced two technologies to reduce costs. High-level …
gem5 + rtl: A framework to enable RTL models inside a full-system simulator
In recent years there has been a surge of interest in designing custom accelerators for
power-efficient high-performance computing. However, available tools to simulate low-level …
Expanding hardware accelerator system design space exploration with gem5-SALAMv2
With the prevalence of hardware accelerators as an integral part of the modern systems on
chip (SoCs), the ability to model accelerators quickly and accurately within the system in …
Prof5: A RISC-V profiler tool
J Silveira, L Castro, V Araújo, R Zeli… - 2022 IEEE 34th …, 2022 - ieeexplore.ieee.org
RISC-V is supported by a series of design and simulation tools that enable simple instruction
set customization and rapid exploration of application-specific accelerators. Evaluating the …
gem5-NVDLA: A Simulation Framework for Compiling, Scheduling and Architecture Evaluation on AI System-on-Chips
C Lai, W Zhang - ACM Transactions on Design Automation of Electronic …, 2024 - dl.acm.org
Recent years have seen an increasing trend in designing AI accelerators together with the
rest of the system, including CPUs and memory hierarchy. This trend calls for high-quality …
Energy efficient mapping on manycore with dynamic and partial reconfiguration: Application to a smart camera
R Bonamy, S Bilavarn, F Muller… - … Journal of Circuit …, 2018 - Wiley Online Library
This paper describes a methodology to improve the energy efficiency of high‐performance
multiprocessor architectures with dynamic and partial reconfiguration (DPR), based on a …
ERAS: A Flexible and Scalable Framework for Seamless Integration of RTL Models with Structural Simulation Toolkit
S Nema, SK Chunduru, C Kodigal… - 2023 IEEE …, 2023 - ieeexplore.ieee.org
The prevalence of custom Intellectual Properties (IPs) poses challenges for assessing their
system-level performance and functional behavior. Register Transfer Level (RTL) simulation …
A combined fast/cycle accurate simulation tool for reconfigurable accelerator evaluation: application to distributed data management
Parallel computing systems based on reconfigurable accelerators are becoming (1)
increasingly heterogeneous, (2) difficult to design and (3) complex to model. Such modeling …