Efficient sparse matrix-vector multiplication on x86-based many-core processors X Liu, M Smelyanskiy, E Chow, P Dubey Proceedings of the 27th international ACM conference on International …, 2013 | 330 | 2013 |
FROSTT: The formidable repository of open sparse tensors and tools S Smith, JW Choi, J Li, R Vuduc, J Park, X Liu, G Karypis | 159 | 2017 |
Algorithmic time, energy, and power on candidate HPC compute building blocks J Choi, M Dukhan, X Liu, R Vuduc 2014 IEEE 28th International Parallel and Distributed Processing Symposium …, 2014 | 100 | 2014 |
Software-hardware co-design for fast and scalable training of deep learning recommendation models D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ... Proceedings of the 49th Annual International Symposium on Computer …, 2022 | 80 | 2022 |
TT-Rec: Tensor train compression for deep learning recommendation models C Yin, B Acun, CJ Wu, X Liu Proceedings of Machine Learning and Systems 3, 448-462, 2021 | 74 | 2021 |
Efficient shared-memory implementation of high-performance conjugate gradient benchmark and its application to unstructured matrices J Park, M Smelyanskiy, K Vaidyanathan, A Heinecke, DD Kalamkar, X Liu, ... SC'14: Proceedings of the International Conference for High Performance …, 2014 | 67 | 2014 |
Truss decomposition on shared-memory parallel systems S Smith, X Liu, NK Ahmed, AS Tom, F Petrini, G Karypis 2017 IEEE High Performance Extreme Computing Conference (HPEC), 1-6, 2017 | 60 | 2017 |
Optimizing sparse matrix-vector multiplication for large-scale data analytics D Buono, F Petrini, F Checconi, X Liu, X Que, C Long, TC Tuan Proceedings of the 2016 International Conference on Supercomputing, 1-12, 2016 | 59 | 2016 |
Blocking Optimization Techniques for Sparse Tensor Computation J Choi, X Liu, S Smith, T Simon 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2018 | 36 | 2018 |
Parallel scalability of Hartree–Fock calculations E Chow, X Liu, M Smelyanskiy, JR Hammond The Journal of Chemical Physics 142 (10), 2015 | 35 | 2015 |
On optimizing distributed Tucker decomposition for dense tensors VT Chakaravarthy, JW Choi, DJ Joseph, X Liu, P Murali, Y Sabharwal, ... Parallel and Distributed Processing Symposium (IPDPS), 2017 IEEE …, 2017 | 34 | 2017 |
A new scalable parallel algorithm for Fock matrix construction X Liu, A Patel, E Chow 2014 IEEE 28th International Parallel and Distributed Processing Symposium …, 2014 | 34 | 2014 |
Improving the performance of dynamical simulations via multiple right-hand sides X Liu, E Chow, K Vaidyanathan, M Smelyanskiy 2012 IEEE 26th International Parallel and Distributed Processing Symposium …, 2012 | 33 | 2012 |
Towards compact neural networks via end-to-end training: A Bayesian tensor approach with automatic rank determination C Hawkins, X Liu, Z Zhang SIAM Journal on Mathematics of Data Science 4 (1), 46-71, 2022 | 32 | 2022 |
High-performance, distributed training of large-scale deep learning recommendation models D Mudigere, Y Hao, J Huang, A Tulloch, S Sridharan, X Liu, M Ozdal, ... arXiv preprint arXiv:2104.05158, 2021 | 32 | 2021 |
High-performance dense Tucker decomposition on GPU clusters J Choi, X Liu, V Chakaravarthy SC18: International Conference for High Performance Computing, Networking …, 2018 | 32 | 2018 |
Scaling up Hartree–Fock Calculations on Tianhe-2 E Chow, X Liu, S Misra, M Dukhan, M Smelyanskiy, JR Hammond, Y Du, ... International Journal of High Performance Computing Applications, 2015 | 28 | 2015 |
Genome sequences for five strains of the emerging pathogen Haemophilus haemolyticus IK Jordan, AB Conley, IV Antonov, RA Arthur, ED Cook, GP Cooper, ... Journal of Bacteriology 193 (20), 5879-5880, 2011 | 28 | 2011 |
Picture processing via a shared decoded picture pool Y Yuan, R Yan, S Xu, X Liu, HD Li US Patent 8,300,704, 2012 | 27 | 2012 |
A Sparse Direct Solver for Distributed Memory Xeon Phi-accelerated Systems P Sao, X Liu, R Vuduc, X Li Parallel and Distributed Processing Symposium, 2015 IEEE 29th International …, 2015 | 25 | 2015 |