Resiliency of automotive object detection networks on GPU architectures A Lotfi, S Hukerikar, K Balasubramanian, P Racunas, N Saxena, ... 2019 IEEE International Test Conference (ITC), 1-9, 2019 | 55 | 2019 |
Resilience Design Patterns: A Structured Approach to Resilience at Extreme Scale S Hukerikar, C Engelmann Supercomputing Frontiers and Innovations 4 (3), 1-38, 2017 | 52 | 2017 |
Characterizing and Mitigating Soft Errors in GPU DRAM MB Sullivan, N Saxena, M O'Connor, D Lee, P Racunas, S Hukerikar, ... MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture …, 2021 | 39 | 2021 |
Big data meets HPC log analytics: Scalable approach to understanding systems at extreme scale BH Park, S Hukerikar, R Adamson, C Engelmann 2017 IEEE International Conference on Cluster Computing (CLUSTER), 758-765, 2017 | 37 | 2017 |
Shrink or substitute: Handling process failures in HPC systems using in-situ recovery RA Ashraf, S Hukerikar, C Engelmann 2018 26th Euromicro International Conference on Parallel, Distributed and …, 2018 | 30 | 2018 |
Opportunistic application-level fault detection through adaptive redundant multithreading S Hukerikar, PC Diniz, RF Lucas, K Teranishi 2014 International Conference on High Performance Computing & Simulation …, 2014 | 27 | 2014 |
RedThreads: An interface for application-level fault detection/correction through adaptive redundant multithreading S Hukerikar, K Teranishi, PC Diniz, RF Lucas International Journal of Parallel Programming, 1-27, 2016 | 21 | 2016 |
Resilience design patterns: A structured approach to resilience at extreme scale (version 1.2) S Hukerikar, C Engelmann Oak Ridge National Laboratory Technical Report, 2017 | 19 | 2017 |
Resilience Design Patterns - A Structured Approach to Resilience at Extreme Scale (version 1.0) S Hukerikar, C Engelmann https://ornlwiki.atlassian.net/wiki/download/attachments/72351753 …, 2016 | 19 | 2016 |
An evaluation of lazy fault detection based on adaptive redundant multithreading S Hukerikar, K Teranishi, PC Diniz, RF Lucas 2014 IEEE High Performance Extreme Computing Conference (HPEC), 1-6, 2014 | 19 | 2014 |
Resilience design patterns: A structured approach to resilience at extreme scale (version 1.1) S Hukerikar, C Engelmann Tech. Rep. ORNL/TM-2016/767, Oak Ridge National Laboratory, Oak Ridge, TN … | 19* | |
Rolex: Resilience-oriented language extensions for extreme-scale systems S Hukerikar, RF Lucas The Journal of Supercomputing 72 (12), 4662-4695, 2016 | 18 | 2016 |
A programming model for resilience in extreme scale computing S Hukerikar, PC Diniz, RF Lucas IEEE/IFIP International Conference on Dependable Systems and Networks …, 2012 | 16 | 2012 |
Pattern-based Modeling of Multiresilience Solutions for High-Performance Computing RA Ashraf, S Hukerikar, C Engelmann Proceedings of the 2018 ACM/SPEC International Conference on Performance …, 2018 | 9 | 2018 |
A Case for Adaptive Redundancy for HPC Resilience S Hukerikar, PC Diniz, RF Lucas European Conference on Parallel Processing, 690-697, 2013 | 9 | 2013 |
A pattern language for high-performance computing resilience S Hukerikar, C Engelmann Proceedings of the 22nd European Conference on Pattern Languages of Programs …, 2017 | 8 | 2017 |
Programming model extensions for resilience in extreme scale computing S Hukerikar, PC Diniz, RF Lucas European Conference on Parallel Processing, 496-498, 2012 | 6 | 2012 |
Towards New Metrics for High-Performance Computing Resilience S Hukerikar, RA Ashraf, C Engelmann Proceedings of the 2017 Workshop on Fault-Tolerance for HPC at Extreme Scale …, 2017 | 4 | 2017 |
Havens: Explicit Reliable Memory Regions for HPC Applications S Hukerikar, C Engelmann | 4* | |
Robust graph traversal: Resiliency techniques for data intensive supercomputing S Hukerikar, PC Diniz, RF Lucas High Performance Extreme Computing Conference (HPEC), 2013 IEEE, 1-6, 2013 | 3 | 2013 |