Cheng Li: Personal Academic Profile - Scholar Search

Cited by

	All	Since 2019
Citations	1531	1113
h-index	16	16
i10-index	22	22

Citations per year

Year	Citations
2014	16
2015	39
2016	98
2017	112
2018	130
2019	143
2020	175
2021	172
2022	147
2023	259
2024	216

Public access

10 articles

1 article

available

not available

Based on funding mandates

Co-authors

Wen-mei W. Hwu, Senior Distinguished Research Scientist, NVIDIA; Professor and Sanders-AMD Chair of Electrical and … (verified email at illinois.edu)
Abdul Dakkak, Modular (verified email at modular.com)
Jinjun Xiong, University at Buffalo (verified email at buffalo.edu)
Michael Laurenzano, Clinc, Inc. (verified email at clinc.com)
Trevor Mudge, Bredt Family Professor of Engineering, University of Michigan (verified email at eecs.umich.edu)
Ronald Dreslinski, University of Michigan (verified email at umich.edu)
Jason Mars, Professor of Computer Science and Engineering, University of Michigan (verified email at umich.edu)
Lingjia Tang, University of Michigan (verified email at umich.edu)
Johann Hauswald, Stanford, University of Michigan (verified email at umich.edu)
Yunqi Zhang, Meta, Inc (verified email at umich.edu)
Vinicius Petrucci, Micron Technology (verified email at micron.com)
Yiping Kang, University of Michigan (verified email at umich.edu)
John P Hayes, Professor of EECS, University of Michigan (verified email at umich.edu)
Armin Alaghi, University of Washington (verified email at cs.washington.edu)
Quan Chen, Professor, Shanghai Jiao Tong University (verified email at sjtu.edu.cn)
Carl Pearson, Sandia National Labs (verified email at sandia.gov)
Isaac Gelado, NVIDIA (verified email at gelado.org)
Li-Wen Chang, Research Scientist, ByteDance (verified email at bytedance.com)
Izzat El Hajj, American University of Beirut (verified email at aub.edu.lb)
Juan Gómez Luna, NVIDIA (verified email at nvidia.com)

Cheng Li

Microsoft

Verified email at microsoft.com

AI, Deep Learning, Machine Learning, GPU, Parallel Computing


Title	Authors	Venue	Cited by	Year
Sirius: An open end-to-end voice and vision personal assistant and its implications for future warehouse scale computers	J Hauswald, MA Laurenzano, Y Zhang, C Li, A Rovinski, A Khurana, ...	Proceedings of the Twentieth International Conference on Architectural …, 2015	337	2015
Stochastic circuits for real-time image-processing applications	A Alaghi, C Li, JP Hayes	Proceedings of the 50th Annual Design Automation Conference, 1-6, 2013	314	2013
DjiNN and Tonic: DNN as a service and its implications for future warehouse scale computers	J Hauswald, Y Kang, MA Laurenzano, Q Chen, C Li, T Mudge, ...	ACM SIGARCH Computer Architecture News 43 (3S), 27-40, 2015	199	2015
DeepSpeed-Inference: Enabling efficient inference of transformer models at unprecedented scale	RY Aminabadi, S Rajbhandari, AA Awan, C Li, D Li, E Zheng, O Ruwase, ...	SC22: International Conference for High Performance Computing, Networking …, 2022	168	2022
Accelerating reduction and scan using tensor core units	A Dakkak, C Li, J Xiong, I Gelado, W Hwu	Proceedings of the ACM International Conference on Supercomputing, 46-57, 2019	91	2019
KLAP: Kernel launch aggregation and promotion for optimizing dynamic parallelism	I El Hajj, J Gómez-Luna, C Li, LW Chang, D Milojicic, W Hwu	2016 49th Annual IEEE/ACM International Symposium on Microarchitecture …, 2016	43	2016
Evaluating characteristics of CUDA communication primitives on high-bandwidth interconnects	C Pearson, A Dakkak, S Hashash, C Li, IH Chung, J Xiong, WM Hwu	Proceedings of the 2019 ACM/SPEC International Conference on Performance …, 2019	37	2019
XSP: Across-stack profiling and analysis of machine learning models on GPUs	C Li, A Dakkak, J Xiong, W Wei, L Xu, W Hwu	2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2020	32*	2020
Designing future warehouse-scale computers for Sirius, an end-to-end voice and vision personal assistant	J Hauswald, MA Laurenzano, Y Zhang, H Yang, Y Kang, C Li, A Rovinski, ...	ACM Transactions on Computer Systems (TOCS) 34 (1), 1-32, 2016	32	2016
A comprehensive study on post-training quantization for large language models	Z Yao, C Li, X Wu, S Youn, Y He	arXiv preprint arXiv:2303.08302, 2023	30	2023
TrIMS: Transparent and isolated model sharing for low latency deep learning inference in function-as-a-service	A Dakkak, C Li, SG De Gonzalo, J Xiong, W Hwu	2019 IEEE 12th International Conference on Cloud Computing (CLOUD), 372-382, 2019	30	2019
ZeroQuant-V2: Exploring post-training quantization in LLMs from comprehensive study to low rank compensation	Z Yao, X Wu, C Li, S Youn, Y He	arXiv preprint arXiv:2303.08302, 2023	23	2023
AI Matrix: A deep learning benchmark for Alibaba data centers	W Zhang, W Wei, L Xu, L Jin, C Li	arXiv preprint arXiv:1909.10562, 2019	21	2019
Understanding INT4 quantization for transformer models: Latency speedup, composability, and failure cases	X Wu, C Li, RY Aminabadi, Z Yao, Y He	arXiv preprint arXiv:2301.12017, 2023	19	2023
Frustrated with replicating claims of a shared model? A solution	A Dakkak, C Li, J Xiong, WM Hwu	arXiv preprint arXiv:1811.09737, 2018	16*	2018
Matrix factorization on GPUs with memory optimization and approximate computing	W Tan, S Chang, L Fong, C Li, Z Wang, L Cao	Proceedings of the 47th International Conference on Parallel Processing, 1-10, 2018	16	2018
ACM	Y Wang, W Feng, Y Chen, H Yu, M Huang, PS Yu	Visual Domain Adaptation with Manifold Embedded Distribution Alignment, 402-410, 2018	15	2018
Understanding INT4 quantization for language models: Latency speedup, composability, and failure cases	X Wu, C Li, RY Aminabadi, Z Yao, Y He	International Conference on Machine Learning, 37524-37539, 2023	11	2023
MPress: Democratizing billion-scale model training on multi-GPU servers via memory-saving inter-operator parallelism	Q Zhou, H Wang, X Yu, C Li, Y Bai, F Yan, Y Xu	2023 IEEE International Symposium on High-Performance Computer Architecture …, 2023	11	2023
Random-LTD: Random and layerwise token dropping brings efficient training for large-scale transformers	Z Yao, X Wu, C Li, C Holmes, M Zhang, C Li, Y He	arXiv preprint arXiv:2211.11586, 2022	11	2022
