An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness S Hong, H Kim Proceedings of the 36th annual international symposium on Computer …, 2009 | 889 | 2009 |
Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping CK Luk, S Hong, H Kim Proceedings of the 42nd Annual IEEE/ACM international symposium on …, 2009 | 758 | 2009 |
An integrated GPU power and performance model S Hong, H Kim Proceedings of the 37th annual international symposium on Computer …, 2010 | 718 | 2010 |
Feedback directed prefetching: Improving the performance and bandwidth-efficiency of hardware prefetchers S Srinath, O Mutlu, H Kim, YN Patt 2007 IEEE 13th International Symposium on High Performance Computer …, 2007 | 455 | 2007 |
GraphPIM: Enabling instruction-level PIM offloading in graph computing frameworks L Nai, R Hadidi, J Sim, H Kim, P Kumar, H Kim 2017 IEEE International symposium on high performance computer architecture …, 2017 | 331 | 2017 |
A performance analysis framework for identifying potential benefits in GPGPU applications J Sim, A Dasgupta, H Kim, R Vuduc Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of …, 2012 | 269 | 2012 |
Identification of high-risk plaques destined to cause acute coronary syndrome using coronary computed tomographic angiography and computational fluid dynamics JM Lee, G Choi, BK Koo, D Hwang, J Park, J Zhang, KJ Kim, Y Tong, ... Cardiovascular Imaging 12 (6), 1032-1043, 2019 | 252 | 2019 |
Many-thread aware prefetching mechanisms for GPGPU applications J Lee, NB Lakshminarayana, H Kim, R Vuduc 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 213-224, 2010 | 190 | 2010 |
When prefetching works, when it doesn’t, and why J Lee, H Kim, R Vuduc ACM Transactions on Architecture and Code Optimization (TACO) 9 (1), 1-29, 2012 | 189 | 2012 |
GraphBIG: understanding graph computing in the context of industrial solutions L Nai, Y Xia, IG Tanase, H Kim, CY Lin Proceedings of the International Conference for High Performance Computing …, 2015 | 185 | 2015 |
TAP: A TLP-aware cache management policy for a CPU-GPU heterogeneous architecture J Lee, H Kim IEEE International Symposium on High-Performance Computer Architecture, 1-12, 2012 | 168 | 2012 |
Transparent hardware management of stacked DRAM as part of memory J Sim, AR Alameldeen, Z Chishti, C Wilkerson, H Kim 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture, 13-24, 2014 | 151 | 2014 |
SD3: A scalable approach to dynamic data-dependence profiling M Kim, H Kim, CK Luk 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 535-546, 2010 | 143 | 2010 |
Characterizing the deployment of deep neural networks on commercial edge devices R Hadidi, J Cao, Y Xie, B Asgari, T Krishna, H Kim 2019 IEEE International Symposium on Workload Characterization (IISWC), 35-48, 2019 | 133 | 2019 |
Techniques for efficient processing in runahead execution engines O Mutlu, H Kim, YN Patt 32nd International Symposium on Computer Architecture (ISCA'05), 370-381, 2005 | 133 | 2005 |
A mostly-clean DRAM cache for effective hit speculation and self-balancing dispatch J Sim, GH Loh, H Kim, M O'Connor, M Thottethodi 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, 247-257, 2012 | 126 | 2012 |
Age based scheduling for asymmetric multiprocessors NB Lakshminarayana, J Lee, H Kim Proceedings of the conference on high performance computing networking …, 2009 | 108 | 2009 |
Efficient runahead execution: Power-efficient memory latency tolerance O Mutlu, H Kim, YN Patt IEEE Micro 26 (1), 10-20, 2006 | 94 | 2006 |
Distributed perception by collaborative robots R Hadidi, J Cao, M Woodward, MS Ryoo, H Kim IEEE Robotics and Automation Letters 3 (4), 3709-3716, 2018 | 92 | 2018 |
Power modeling for GPU architectures using McPAT J Lim, NB Lakshminarayana, H Kim, W Song, S Yalamanchili, W Sung ACM Transactions on Design Automation of Electronic Systems (TODAES) 19 (3 …, 2014 | 90 | 2014 |