Jon Stearley
Faith Comes By Hearing
Verified email at sandia.gov
Title
Cited by
Year
What supercomputers say: A study of five system logs
A Oliner, J Stearley
37th annual IEEE/IFIP international conference on dependable systems and …, 2007
Cited by: 659 | Year: 2007
Addressing failures in exascale computing
M Snir, RW Wisniewski, JA Abraham, SV Adve, S Bagchi, P Balaji, J Belak, ...
The International Journal of High Performance Computing Applications 28 (2 …, 2014
Cited by: 520 | Year: 2014
Memory errors in modern systems: The good, the bad, and the ugly
V Sridharan, N DeBardeleben, S Blanchard, KB Ferreira, J Stearley, ...
ACM SIGARCH Computer Architecture News 43 (1), 297-310, 2015
Cited by: 374 | Year: 2015
Evaluating the viability of process replication reliability for exascale systems
K Ferreira, J Stearley, JH Laros III, R Oldfield, K Pedretti, R Brightwell, ...
Proceedings of 2011 International Conference for High Performance Computing …, 2011
Cited by: 330 | Year: 2011
Feng shui of supercomputer memory: Positional effects in DRAM and SRAM faults
V Sridharan, J Stearley, N DeBardeleben, S Blanchard, S Gurumurthi
Proceedings of the International Conference on High Performance Computing …, 2013
Cited by: 235 | Year: 2013
Towards informatic analysis of syslogs
J Stearley
2004 IEEE International Conference on Cluster Computing (IEEE Cat. No …, 2004
Cited by: 193 | Year: 2004
Alert detection in system logs
AJ Oliner, A Aiken, J Stearley
2008 Eighth IEEE International Conference on Data Mining, 959-964, 2008
Cited by: 131 | Year: 2008
Bad words: Finding faults in Spirit's syslogs
J Stearley, AJ Oliner
2008 Eighth IEEE International Symposium on Cluster Computing and the Grid …, 2008
Cited by: 90 | Year: 2008
Bridging the gaps: Joining information sources with Splunk
J Stearley, S Corwell, K Lord
Workshop on Managing Systems via Log Analysis and Machine Learning …, 2010
Cited by: 50 | Year: 2010
Inter-agency workshop on HPC resilience at extreme scale
J Daly, B Harrod, T Hoang, L Nowell, B Adolf, S Borkar, N DeBardeleben, ...
National Security Agency Advanced Computing Systems, 2012
Cited by: 45 | Year: 2012
Increasing fault resiliency in a message-passing environment
K Ferreira, R Riesen, R Oldfield, J Stearley, J Laros, K Pedretti, ...
Sandia National Laboratories, Technical report SAND2009-6753, 2009
Cited by: 38 | Year: 2009
Redundant computing for exascale systems.
JR Stearley, RE Riesen, JH Laros III, KB Ferreira, KTT Pedretti, ...
Sandia National Laboratories (SNL), Albuquerque, NM, and Livermore, CA …, 2010
Cited by: 37 | Year: 2010
Defining and measuring supercomputer Reliability, Availability, and Serviceability (RAS)
J Stearley
Proceedings of the Linux clusters institute conference, 2005
Cited by: 37 | Year: 2005
Does partial replication pay off?
J Stearley, K Ferreira, D Robinson, J Laros, K Pedretti, D Arnold, ...
IEEE/IFIP International Conference on Dependable Systems and Networks …, 2012
Cited by: 35 | Year: 2012
See applications run and throughput jump: The case for redundant computing in HPC
R Riesen, K Ferreira, J Stearley
2010 International Conference on Dependable Systems and Networks Workshops …, 2010
Cited by: 29 | Year: 2010
rMPI: increasing fault resiliency in a message-passing environment
K Ferreira, R Riesen, R Oldfield, J Stearley, J Laros, K Pedretti, ...
Sandia National Laboratories, Albuquerque, NM, Tech. Rep. SAND2011-2488, 2011
Cited by: 25 | Year: 2011
JHL III, R
K Ferreira, R Riesen, P Bridges, D Arnold, J Stearley
Oldfield, K. Pedretti, and R. Brightwell,“Evaluating the viability of …, 2011
Cited by: 23 | Year: 2011
Extra bits on SRAM and DRAM errors–more data from the field
N DeBardeleben, S Blanchard, V Sridharan, S Gurumurthi, J Stearley, ...
IEEE Workshop on Silicon Errors in Logic-System Effects (SELSE), 2014
Cited by: 18 | Year: 2014
Sisyphus log data mining toolkit
J Stearley
Accessed from the Web, 2009
Cited by: 13 | Year: 2009
A State-Machine Approach to Disambiguating Supercomputer Event Logs
J Stearley, R Ballance, L Bauman
Cited by: 10 | Year: 2012
Articles 1–20