Addressing failures in exascale computing
M Snir, RW Wisniewski, JA Abraham… - … Journal of High …, 2014 - journals.sagepub.com
We present here a report produced by a workshop on 'Addressing failures in exascale
computing' held in Park City, Utah, 4–11 August 2012. The charter of this workshop was to …
Accurate microarchitecture-level fault modeling for studying hardware faults
ML Li, P Ramachandran, UR Karpuzcu… - 2009 IEEE 15th …, 2009 - ieeexplore.ieee.org
Decreasing hardware reliability is expected to impede the exploitation of increasing
integration projected by Moore's Law. There is much ongoing research on efficient fault …
mSWAT: Low-cost hardware fault detection and diagnosis for multicore systems
SK Sastry Hari, ML Li, P Ramachandran… - Proceedings of the …, 2009 - dl.acm.org
Continued technology scaling is resulting in systems with billions of devices. Unfortunately,
these devices are prone to failures from various sources, resulting in even commodity …
Characterizing the impact of intermittent hardware faults on programs
L Rashid, K Pattabiraman… - IEEE Transactions on …, 2014 - ieeexplore.ieee.org
Extreme complementary metal-oxide-semiconductor (CMOS) technology scaling is causing
significant concerns in the reliability of computer systems. Intermittent hardware errors are …
Trace-based microarchitecture-level diagnosis of permanent hardware faults
ML Li, P Ramachandran, SK Sahoo… - … and Networks With …, 2008 - ieeexplore.ieee.org
As devices continue to scale, future shipped hardware will likely fail due to in-the-field
hardware faults. As traditional redundancy-based hardware reliability solutions that tackle …
The use of microprocessor trace infrastructures for radiation-induced fault diagnosis
M Peña-Fernandez, A Lindoso… - … on Nuclear Science, 2019 - ieeexplore.ieee.org
This work proposes a methodology to diagnose radiation-induced faults in a microprocessor
using the hardware trace infrastructure. The diagnosis capabilities of this approach are …
Hardware/software codesign architecture for online testing in chip multiprocessors
O Khan, S Kundu - IEEE Transactions on Dependable and …, 2011 - ieeexplore.ieee.org
As the semiconductor industry continues its relentless push for nano-CMOS technologies,
long-term device reliability and occurrence of hard errors have emerged as a major concern …
Microprocessor error diagnosis by trace monitoring under laser testing
M Peña-Fernández, A Lindoso… - … on Nuclear Science, 2021 - ieeexplore.ieee.org
This work explores the diagnosis capabilities of the enriched information provided by
microprocessors trace subsystem combined with laser fault injection. Laser fault injection …
Hardware fault recovery for I/O intensive applications
P Ramachandran, SKS Hari, M Li… - ACM Transactions on …, 2014 - dl.acm.org
With continued process scaling, the rate of hardware failures in commodity systems is
increasing. Because these commodity systems are highly sensitive to cost, traditional …
Techniques for Increasing Security and Reliability of IP Cores Embedded in FPGA and ASIC Designs
D Ziener - 2010 - research.utwente.nl
The focus of this work is faults and attacks in embedded systems, as well as methods to
cope with their associated overhead. This chapter gives a motivation for the topic of this …