TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory M Gao, J Pu, X Yang, M Horowitz, C Kozyrakis Proceedings of the Twenty-Second International Conference on Architectural …, 2017 | 661 | 2017 |
Practical Near-Data Processing for In-memory Analytics Frameworks M Gao, G Ayers, C Kozyrakis 2015 International Conference on Parallel Architecture and Compilation …, 2015 | 340 | 2015 |
Energy-Efficient Abundant-Data Computing: The N3XT 1,000× MM Sabry Aly, M Gao, G Hills, CS Lee, G Pitner, MM Shulaker, TF Wu, ... Computer 48 (12), 24-33, 2015 | 284 | 2015 |
HRL: Efficient and Flexible Reconfigurable Logic for Near-Data Processing M Gao, C Kozyrakis 2016 IEEE International Symposium on High Performance Computer Architecture …, 2016 | 265 | 2016 |
Interstellar: Using Halide's Scheduling Language to Analyze DNN Accelerators X Yang, M Gao, Q Liu, J Setter, J Pu, A Nayak, S Bell, K Cao, H Ha, ... Proceedings of the Twenty-Fifth International Conference on Architectural …, 2020 | 254 | 2020 |
GraphP: Reducing communication for PIM-based graph processing with efficient data partition M Zhang, Y Zhuo, C Wang, M Gao, Y Wu, K Chen, C Kozyrakis, X Qian 2018 IEEE International Symposium on High Performance Computer Architecture …, 2018 | 247 | 2018 |
Improving the accuracy, scalability, and performance of graph neural networks with ROC Z Jia, S Lin, M Gao, M Zaharia, A Aiken Proceedings of Machine Learning and Systems (MLSys), 187-198, 2020 | 230 | 2020 |
Tangram: Optimized Coarse-Grained Dataflow for Scalable NN Accelerators M Gao, X Yang, J Pu, M Horowitz, C Kozyrakis International Conference on Architectural Support for Programming Languages …, 2019 | 179 | 2019 |
DNN Dataflow Choice Is Overrated X Yang, M Gao, J Pu, A Nayak, Q Liu, SE Bell, JO Setter, K Cao, H Ha, ... arXiv preprint arXiv:1809.04070, 2018 | 107 | 2018 |
Optimizing DNN computation with relaxed graph substitutions Z Jia, J Thomas, T Warszawski, M Gao, M Zaharia, A Aiken Proceedings of the 2nd Conference on Systems and Machine Learning (SysML’19), 2019 | 89 | 2019 |
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections H Wang, J Zhai, M Gao, Z Ma, S Tang, L Zheng, Y Li, K Rong, Y Chen, ... 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2021 | 62 | 2021 |
PipeZK: Accelerating Zero-Knowledge Proof with a Pipelined Architecture Y Zhang, S Wang, X Zhang, J Dong, X Mao, F Long, C Wang, D Zhou, ... | 53 | 2021 |
Reconfigurable logic architecture M Gao, H Zheng, KT Malladi, R Brennan US Patent 9,577,644, 2017 | 51 | 2017 |
ShEF: shielded enclaves for cloud FPGAs M Zhao, M Gao, C Kozyrakis Proceedings of the 27th ACM International Conference on Architectural …, 2022 | 40 | 2022 |
DRAF: A Low-Power DRAM-Based Reconfigurable Acceleration Fabric M Gao, C Delimitrou, D Niu, KT Malladi, H Zheng, B Brennan, C Kozyrakis Proceedings of the 43rd International Symposium on Computer Architecture …, 2016 | 38 | 2016 |
PPMLAC: high performance chipset architecture for secure multi-party computation X Zhou, Z Xu, C Wang, M Gao Proceedings of the 49th Annual International Symposium on Computer …, 2022 | 22 | 2022 |
FINGERS: Exploiting Fine-Grained Parallelism in Graph Mining Accelerators Q Chen, B Tian, M Gao Proceedings of the 27th ACM International Conference on Architectural …, 2022 | 22 | 2022 |
Spada: Accelerating Sparse Matrix Multiplication with Adaptive Dataflow Z Li, J Li, T Chen, D Niu, H Zheng, Y Xie, M Gao Proceedings of the 28th ACM International Conference on Architectural …, 2023 | 21 | 2023 |
Special session paper: 3D nanosystems enable embedded abundant-data computing W Hwang, MMS Aly, YH Malviya, M Gao, TF Wu, C Kozyrakis, HSP Wong, ... 2017 International Conference on Hardware/Software Codesign and System …, 2017 | 18* | 2017 |
GZKP: A GPU Accelerated Zero-Knowledge Proof System W Ma, Q Xiong, X Shi, X Ma, H Jin, H Kuang, M Gao, Y Zhang, H Shen, ... Proceedings of the 28th ACM International Conference on Architectural …, 2023 | 15 | 2023 |