A quantitative analysis on microarchitectures of modern CPU-FPGA platforms Y Choi, J Cong, Z Fang, Y Hao, G Reinman, P Wei Proceedings of the 53rd Annual Design Automation Conference, 1-6, 2016 | 187 | 2016 |
Supporting address translation for accelerator-centric architectures Y Hao, Z Fang, G Reinman, J Cong 2017 IEEE International Symposium on High Performance Computer Architecture …, 2017 | 122 | 2017 |
PyTorch FSDP: experiences on scaling fully sharded data parallel Y Zhao, A Gu, R Varma, L Luo, CC Huang, M Xu, L Wright, H Shojanazeri, ... arXiv preprint arXiv:2304.11277, 2023 | 105 | 2023 |
Software-hardware co-design for fast and scalable training of deep learning recommendation models D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ... Proceedings of the 49th Annual International Symposium on Computer …, 2022 | 82 | 2022 |
In-depth analysis on microarchitectures of modern heterogeneous CPU-FPGA platforms YK Choi, J Cong, Z Fang, Y Hao, G Reinman, P Wei ACM Transactions on Reconfigurable Technology and Systems (TRETS) 12 (1), 1-20, 2019 | 53 | 2019 |
Hardware acceleration for an accurate stereo vision system using mini-census adaptive support region Y Shan, Y Hao, W Wang, Y Wang, X Chen, H Yang, W Luk ACM Transactions on Embedded Computing Systems (TECS) 13 (4s), 1-24, 2014 | 46 | 2014 |
On-chip interconnection network for accelerator-rich architectures J Cong, M Gill, Y Hao, G Reinman, B Yuan Proceedings of the 52nd Annual Design Automation Conference, 1-6, 2015 | 40 | 2015 |
Best-effort FPGA programming: A few steps can go a long way J Cong, Z Fang, Y Hao, P Wei, CH Yu, C Zhang, P Zhou arXiv preprint arXiv:1807.01340, 2018 | 33 | 2018 |
FPGA based memory efficient high resolution stereo vision system for video tolling Y Shan, Z Wang, W Wang, Y Hao, Y Wang, K Tsoi, W Luk, H Yang 2012 International Conference on Field-Programmable Technology, 29-32, 2012 | 19 | 2012 |
MTIA: First generation silicon targeting Meta's recommendation systems A Firoozshahian, J Coburn, R Levenstein, R Nattoji, A Kamath, O Wu, ... Proceedings of the 50th Annual International Symposium on Computer …, 2023 | 12 | 2023 |
DHEN: A deep and hierarchical ensemble network for large-scale click-through rate prediction B Zhang, L Luo, X Liu, J Li, Z Chen, W Zhang, X Wei, Y Hao, M Tsang, ... arXiv preprint arXiv:2203.11014, 2022 | 12 | 2022 |
Software-hardware co-design of heterogeneous SmartNIC system for recommendation models inference and training A Guo, Y Hao, C Wu, P Haghi, Z Pan, M Si, D Tao, A Li, M Herbordt, ... Proceedings of the 37th International Conference on Supercomputing, 336-347, 2023 | 10 | 2023 |
Reconfigurable Accelerator Compute Hierarchy: A Case Study using Content-Based Image Retrieval N Farahpour, Y Hao, Z Fang, G Reinman 2020 IEEE International Symposium on Workload Characterization (IISWC), 276-287, 2020 | 1 | 2020 |
Architectural Techniques to Enhance the Efficiency of Accelerator-Centric Architectures Y Hao University of California, Los Angeles, 2018 | | 2018 |