High performance parallelism pearls volume two: multicore and many-core programming approaches J Jeffers, J Reinders Morgan Kaufmann, 2015 | 151 | 2015 |
A unified framework for optimizing locality, parallelism, and communication in out-of-core computations M Kandemir, A Choudhary, J Ramanujam, MA Kandaswamy IEEE Transactions on Parallel and Distributed Systems 11 (7), 648-668, 2000 | 38 | 2000 |
A prefetching prototype for the parallel file systems on the Paragon M Arunachalam, A Choudhary ACM SIGMETRICS Performance Evaluation Review 23 (1), 321-322, 1995 | 23 | 1995 |
Method to assess energy efficiency of HPC system operated with and without power constraints D Bodas, M Arunachalam, I Sharapov, CR Yount, SB Huck, R Huggahalli, ... US Patent 9,971,391, 2018 | 22 | 2018 |
I/O phase characterization of TPC-H query operations MA Kandaswamy, RL Knighten Proceedings IEEE International Computer Performance and Dependability …, 2000 | 17 | 2000 |
Implementation and evaluation of prefetching in the Intel Paragon parallel file system M Arunachalam, A Choudhary, B Rullman Proceedings of International Conference on Parallel Processing, 554-559, 1996 | 17 | 1996 |
An experimental evaluation of I/O optimizations on different applications MA Kandaswamy, M Kandemir, A Choudhary, D Bernholdt IEEE Transactions on Parallel and Distributed Systems 13 (12), 1303-1319, 2002 | 14 | 2002 |
Performance implications of architectural and software techniques on I/O-intensive applications MA Kandaswamy, M Kandemir, A Choudhary, DE Bernholdt Proceedings. 1998 International Conference on Parallel Processing (Cat. No …, 1998 | 14 | 1998 |
Optimizing CPU performance for recommendation systems at-scale R Jain, S Cheng, V Kalagi, V Sanghavi, S Kaul, M Arunachalam, K Maeng, ... Proceedings of the 50th Annual International Symposium on Computer …, 2023 | 12 | 2023 |
A unified compiler algorithm for optimizing locality, parallelism and communication in out-of-core computations M Kandemir, A Choudhary, J Ramanujam, M Kandaswamy Proceedings of the fifth workshop on I/O in parallel and distributed systems …, 1997 | 10 | 1997 |
Performance and energy evaluation of data prefetching on Intel Xeon Phi D Guttman, MT Kandemir, M Arunachalam, V Calina 2015 IEEE International Symposium on Performance Analysis of Systems and …, 2015 | 9 | 2015 |
Locality optimization algorithms for compilation of out-of-core codes M Kandemir, A Choudhary, J Ramanujam, M Kandaswamy J. Inf. Sci. Eng. 14 (1), 107-138, 1998 | 7 | 1998 |
A learning-guided hierarchical approach for biomedical image segmentation H Jiang, A Sarma, J Ryoo, JB Kotra, M Arunachalam, CR Das, ... 2018 31st IEEE International System-on-Chip Conference (SOCC), 227-232, 2018 | 6 | 2018 |
Optimization and evaluation of Hartree-Fock application's I/O with PASSION MA Kandaswamy, MT Kandemir, AN Choudhary, DE Bernholdt Proceedings of the 1997 ACM/IEEE conference on Supercomputing, 1-20, 1997 | 6 | 1997 |
Implementation and evaluation of collective I/O in the Intel Paragon parallel file system R Bordawekar 1996 | 6 | 1996 |
Multiobjective evaluation and optimization of CMT-bone on multiple CPU/GPU systems M Gadou, T Banerjee, M Arunachalam, S Ranka Sustainable Computing: Informatics and Systems 22, 259-271, 2019 | 5 | 2019 |
Methods and apparatus for distributed training of a neural network M Arunachalam, ATR Rajan, D Karkada, A Procter, V Saletore US Patent App. 15/829,555, 2019 | 5 | 2019 |
Efficient k nearest neighbor algorithm implementations for throughput-oriented architectures J Ryoo, M Arunachalam, R Khanna, MT Kandemir 2018 19th International Symposium on Quality Electronic Design (ISQED), 144-150, 2018 | 5 | 2018 |
Machine learning techniques for improved data prefetching D Guttman, MT Kandemir, M Arunachalam, R Khanna 5th International Conference on Energy Aware Computing Systems …, 2015 | 5 | 2015 |
Adaptive control of multiple prefetchers MA Kandaswamy, SC Steely US Patent App. 11/695,022, 2008 | 5 | 2008 |