Understanding the propagation of hard errors to software and implications for resilient system design

ML Li, P Ramachandran, SK Sahoo, SV Adve… - ACM Sigplan …, 2008 - dl.acm.org
With continued CMOS scaling, future shipped hardware will be increasingly vulnerable to
in-the-field faults. To be broadly deployable, the hardware reliability solution must incur low …

Parameter variation tolerance and error resiliency: New design paradigm for the nanoscale era

S Ghosh, K Roy - Proceedings of the IEEE, 2010 - ieeexplore.ieee.org
Variations in process parameters affect the operation of integrated circuits (ICs) and pose a
significant threat to the continued scaling of transistor dimensions. Such parameter …

Exploiting structural duplication for lifetime reliability enhancement

J Srinivasan, SV Adve, P Bose… - … Architecture (ISCA'05), 2005 - ieeexplore.ieee.org
Increased power densities (and resultant temperatures) and other effects of device scaling
are predicted to cause significant lifetime reliability problems in the near future. In this paper …

Configurable isolation: building high availability systems with commodity multi-core processors

N Aggarwal, P Ranganathan, NP Jouppi… - ACM SIGARCH …, 2007 - dl.acm.org
High availability is an increasingly important requirement for enterprise systems, often
valued more than performance. Systems designed for high availability typically use …

BulletProof: A defect-tolerant CMP switch architecture

K Constantinides, S Plaza, J Blome… - … Symposium on High …, 2006 - ieeexplore.ieee.org
As silicon technologies move into the nanometer regime, transistor reliability is expected to
wane as devices become subject to extreme process variation, particle-induced transient …

Scalable thread scheduling and global power management for heterogeneous many-core architectures

JA Winter, DH Albonesi, CA Shoemaker - Proceedings of the 19th …, 2010 - dl.acm.org
Future many-core microprocessors are likely to be heterogeneous, by design or due to
variability and defects. The latter type of heterogeneity is especially challenging due to its …

Web portal functionality and state government e-service

JP Gant, DB Gant - Proceedings of the 35th Annual Hawaii …, 2002 - ieeexplore.ieee.org
This paper reports the results of a study investigating the role of Web portals in state
government electronic service delivery. We describe the functionality of the fifty US state …

Architectures for online error detection and recovery in multicore processors

D Gizopoulos, M Psarakis, SV Adve… - … , Automation & Test …, 2011 - ieeexplore.ieee.org
The huge investment in the design and production of multicore processors may be put at risk
because the emerging highly miniaturized but unreliable fabrication technologies will …

Flicker: A dynamically adaptive architecture for power limited multicore systems

P Petrica, AM Izraelevitz, DH Albonesi… - Proceedings of the 40th …, 2013 - dl.acm.org
Future microprocessors may become so power constrained that not all transistors will be
able to be powered on at once. These systems will be required to nimbly adapt to changes …

Architectural core salvaging in a multi-core processor for hard-error tolerance

MD Powell, A Biswas, S Gupta… - ACM SIGARCH Computer …, 2009 - dl.acm.org
The incidence of hard errors in CPUs is a challenge for future multicore designs due to
increasing total core area. Even if the location and nature of hard errors are known a priori …