FPGAs vs. CPUs: trends in peak floating-point performance | K Underwood | Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field …, 2004 | 460 | 2004 |
Evaluating NIC hardware requirements to achieve high message rate PGAS support on multi-core processors | KD Underwood, MJ Levenhagen, R Brightwell | Proceedings of the 2007 ACM/IEEE conference on Supercomputing, 1-10, 2007 | 409 | 2007 |
Closing the gap: CPU and FPGA trends in sustainable floating-point BLAS performance | KD Underwood, KS Hemmert | 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines …, 2004 | 233 | 2004 |
Intel® Omni-path Architecture: Enabling Scalable, High Performance Fabrics | MS Birrittella, M Debbage, R Huggahalli, J Kunz, T Lovett, T Rimmer, ... | 2015 IEEE 23rd Annual Symposium on High-Performance Interconnects, 1-9, 2015 | 194 | 2015 |
A re-evaluation of the practicality of floating-point operations on FPGAs | WB Ligon III, S McMillan, G Monn, K Schoonover, F Stivers, ... | FPGAs for Custom Computing Machines, 1998. Proceedings. IEEE Symposium on …, 1998 | 182 | 1998 |
SeaStar interconnect: Balanced bandwidth for scalable performance | R Brightwell, KT Pedretti, KD Underwood, T Hudson | IEEE Micro 26 (3), 41-57, 2006 | 162 | 2006 |
Remote memory access programming in MPI-3 | T Hoefler, J Dinan, R Thakur, B Barrett, P Balaji, W Gropp, K Underwood | ACM Transactions on Parallel Computing (TOPC) 2 (2), 1-26, 2015 | 134 | 2015 |
A hardware acceleration unit for MPI queue processing | KD Underwood, KS Hemmert, A Rodrigues, R Murphy, R Brightwell | Parallel and Distributed Processing Symposium, 2005. Proceedings. 19th IEEE …, 2005 | 112 | 2005 |
Embedded floating-point units in FPGAs | MJ Beauchamp, S Hauck, KD Underwood, KS Hemmert | Proceedings of the 2006 ACM/SIGDA 14th international symposium on Field …, 2006 | 111 | 2006 |
An analysis of the double-precision floating-point FFT on FPGAs | KS Hemmert, KD Underwood | 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines …, 2005 | 99 | 2005 |
RC-BLAST: Towards a portable, cost-effective open source hardware implementation | K Muriki, KD Underwood, R Sass | 19th IEEE International Parallel and Distributed Processing Symposium, 8 pp., 2005 | 97 | 2005 |
The impact of MPI queue usage on message latency | KD Underwood, R Brightwell | Parallel Processing, 2004. ICPP 2004. International Conference on, 152-160, 2004 | 85 | 2004 |
A comparison of floating point and logarithmic number systems for FPGAs | M Haselman, M Beauchamp, A Wood, S Hauck, K Underwood, ... | 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines …, 2005 | 81 | 2005 |
An analysis of NIC resource usage for offloading MPI | R Brightwell, KD Underwood | Parallel and Distributed Processing Symposium, 2004. Proceedings. 18th …, 2004 | 79 | 2004 |
The Portals 4.0 network programming interface | BW Barrett, R Brightwell, S Hemmert, K Pedretti, K Wheeler, K Underwood, ... | Sandia National Laboratories, 2012 | 77* | 2012 |
Architectural modifications to enhance the floating-point performance of FPGAs | MJ Beauchamp, S Hauck, KD Underwood, KS Hemmert | IEEE Transactions on Very Large Scale Integration (VLSI) Systems 16 (2), 177-187, 2008 | 73 | 2008 |
An analysis of the impact of MPI overlap and independent progress | R Brightwell, KD Underwood | Proceedings of the 18th annual international conference on Supercomputing …, 2004 | 70 | 2004 |
Analyzing the impact of overlap, offload, and independent progress for message passing interface applications | R Brightwell, R Riesen, KD Underwood | The International Journal of High Performance Computing Applications 19 (2 …, 2005 | 63 | 2005 |
Initial performance evaluation of the Cray SeaStar interconnect | R Brightwell, K Pedretti, KD Underwood | 13th Symposium on High Performance Interconnects (HOTI'05), 51-57, 2005 | 61 | 2005 |
Mitigating MPI message matching misery | M Flajslik, J Dinan, KD Underwood | High Performance Computing: 31st International Conference, ISC High …, 2016 | 59 | 2016 |