PowerPack: Energy profiling and analysis of high-performance systems and applications R Ge, X Feng, S Song, HC Chang, D Li, KW Cameron IEEE Transactions on Parallel and Distributed Systems 21 (5), 658-671, 2009 | 534 | 2009 |
SuperNeurons: Dynamic GPU memory management for training deep neural networks L Wang, J Ye, Y Zhao, W Wu, A Li, SL Song, Z Xu, T Kraska Proceedings of the 23rd ACM SIGPLAN symposium on principles and practice of …, 2018 | 276 | 2018 |
Evaluating modern GPU interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect A Li, SL Song, J Chen, J Li, X Liu, NR Tallent, KJ Barker IEEE Transactions on Parallel and Distributed Systems 31 (1), 94-110, 2019 | 241 | 2019 |
A Simplified and Accurate Model of Power-Performance Efficiency on Emergent GPU Architectures S Song, C Su, B Rountree, KW Cameron 27th IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2013 | 204 | 2013 |
Locality-driven dynamic GPU cache bypassing C Li, SL Song, H Dai, A Sidelnik, SKS Hari, H Zhou Proceedings of the 29th ACM on International Conference on Supercomputing, 67-77, 2015 | 139 | 2015 |
GraphReduce: processing large-scale graphs on accelerator-based systems D Sengupta, SL Song, K Agarwal, K Schwan Proceedings of the International Conference for High Performance Computing …, 2015 | 113 | 2015 |
Locality-aware CTA clustering for modern GPUs A Li, SL Song, W Liu, X Liu, A Kumar, H Corporaal ACM SIGARCH Computer Architecture News 45 (1), 297-311, 2017 | 90 | 2017 |
Iso-energy-efficiency: An approach to power-constrained parallel computation S Song, CY Su, R Ge, A Vishnu, KW Cameron 2011 IEEE International Parallel & Distributed Processing Symposium, 128-139, 2011 | 72 | 2011 |
Randomness in neural network training: Characterizing the impact of tooling D Zhuang, X Zhang, S Song, S Hooker Proceedings of Machine Learning and Systems 4, 316-336, 2022 | 70 | 2022 |
Energy profiling and analysis of the HPC Challenge benchmarks S Song, R Ge, X Feng, KW Cameron The International Journal of High Performance Computing Applications 23 (3 …, 2009 | 67 | 2009 |
Tartan: evaluating modern GPU interconnect via a multi-GPU benchmark suite A Li, SL Song, J Chen, X Liu, N Tallent, K Barker 2018 IEEE International Symposium on Workload Characterization (IISWC), 191-202, 2018 | 63 | 2018 |
Exploring and analyzing the real impact of modern on-package memory on HPC scientific kernels A Li, W Liu, MRB Kristensen, B Vinter, H Wang, K Hou, A Marquez, ... Proceedings of the International Conference for High Performance Computing …, 2017 | 57 | 2017 |
Unified performance and power modeling of scientific workloads SL Song, K Barker, D Kerbyson Proceedings of the 1st International Workshop on Energy Efficient …, 2013 | 56 | 2013 |
Processing-in-memory enabled graphics processors for 3D rendering C Xie, SL Song, J Wang, W Zhang, X Fu 2017 IEEE International Symposium on High Performance Computer Architecture …, 2017 | 55 | 2017 |
New-Sum: A novel online ABFT scheme for general iterative methods D Tao, SL Song, S Krishnamoorthy, P Wu, X Liang, EZ Zhang, ... Proceedings of the 25th ACM International Symposium on High-Performance …, 2016 | 54 | 2016 |
Investigating the interplay between energy efficiency and resilience in high performance computing L Tan, SL Song, P Wu, Z Chen, R Ge, DJ Kerbyson 2015 IEEE International Parallel and Distributed Processing Symposium, 786-796, 2015 | 53 | 2015 |
CUDAAdvisor: LLVM-based runtime profiling for modern GPUs D Shen, SL Song, A Li, X Liu Proceedings of the 2018 International Symposium on Code Generation and …, 2018 | 51 | 2018 |
MIC-SVM: Designing a highly efficient support vector machine for advanced modern multi-core and many-core architectures Y You, SL Song, H Fu, A Marquez, MM Dehnavi, K Barker, KW Cameron, ... 2014 IEEE 28th International Parallel and Distributed Processing Symposium …, 2014 | 51 | 2014 |
LP-BNN: Ultra-low-latency BNN inference with layer parallelism T Geng, T Wang, C Wu, C Yang, SL Song, A Li, M Herbordt 2019 IEEE 30th International Conference on Application-specific Systems …, 2019 | 45 | 2019 |
EvoGraph: On-the-fly efficient mining of evolving graphs on GPU D Sengupta, SL Song High Performance Computing: 32nd International Conference, ISC High …, 2017 | 45 | 2017 |