Demystifying soft error assessment strategies on ARM CPUs: Microarchitectural fault injection vs. neutron beam experiments

A Chatzidimitriou, P Bodmann… - 2019 49th Annual …, 2019 - ieeexplore.ieee.org
Fault injection in early microarchitecture-level simulation CPU models and beam
experiments on the final physical CPU chip are two established methodologies to access the …

Soft error effects on ARM microprocessors: Early estimations versus chip measurements

PR Bodmann, G Papadimitriou… - IEEE Transactions …, 2021 - ieeexplore.ieee.org
Extensive research efforts are being carried out to evaluate and improve the reliability of
computing devices either through beam experiments or simulation-based fault injection …

Impact of voltage scaling on soft errors susceptibility of multicore server CPUs

D Agiakatsikas, G Papadimitriou, V Karakostas… - Proceedings of the 56th …, 2023 - dl.acm.org
Microprocessor power consumption and dependability are both crucial challenges that
designers have to cope with due to shrinking feature sizes and increasing transistor counts …

SyRA: Early system reliability analysis for cross-layer soft errors resilience in memory arrays of microprocessor systems

A Vallero, A Savino, A Chatzidimitriou… - IEEE Transactions …, 2018 - ieeexplore.ieee.org
Cross-layer reliability is becoming the preferred solution when reliability is a concern in the
design of a microprocessor-based system. Nevertheless, deciding how to distribute the error …

SIFI: AMD southern islands GPU microarchitectural level fault injector

A Vallero, D Gizopoulos… - 2017 IEEE 23rd …, 2017 - ieeexplore.ieee.org
General Purpose computing on Graphics Processing Unit offers a remarkable speedup for
data parallel workloads, leveraging GPUs computational power. However, differently from …

A methodology for comparing the reliability of GPU-based and CPU-based HPCs

N Cini, G Yalcin - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
Today, GPUs are widely used as coprocessors/accelerators in High-Performance
Heterogeneous Computing due to their many advantages. However, many researches …

Multi-faceted microarchitecture level reliability characterization for NVIDIA and AMD GPUs

A Vallero, S Tselonis, D Gizopoulos… - 2018 IEEE 36th VLSI …, 2018 - ieeexplore.ieee.org
State-of-the-art GPU chips are designed to deliver extreme throughput for graphics as well
as for data-parallel general purpose computing workloads (GPGPU computing). Unlike …

Can I/O variability be reduced on QoS-less HPC storage systems?

D Huang, Q Liu, J Choi, N Podhorszki… - IEEE Transactions …, 2018 - ieeexplore.ieee.org
For a production high-performance computing (HPC) system, where storage devices are
shared between multiple applications and managed in a best effort manner, I/O contention is …

Accurate FIT rate estimation through high-level software fault injection

PR Bodmann, D Oliveira, P Rech - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Reliability is today one of the major issues for computing devices from the embedded
domain up to large high-performance systems. To safely deploy a computing device in a …

Microarchitecture level reliability comparison of modern GPU designs: First findings

A Vallero, S Di Carlo, S Tselonis… - … Analysis of Systems …, 2017 - ieeexplore.ieee.org
State-of-the-art GPU chips are designed to deliver extreme throughput for graphics as well
as for data-parallel general purpose computing workloads (GPGPU computing). Unlike …