Cheng Li
Verified email at microsoft.com
Title
Cited by
Year
Sirius: An open end-to-end voice and vision personal assistant and its implications for future warehouse scale computers
J Hauswald, MA Laurenzano, Y Zhang, C Li, A Rovinski, A Khurana, ...
Proceedings of the Twentieth International Conference on Architectural …, 2015
Cited by: 337; Year: 2015
Stochastic circuits for real-time image-processing applications
A Alaghi, C Li, JP Hayes
Proceedings of the 50th Annual Design Automation Conference, 1-6, 2013
Cited by: 314; Year: 2013
DjiNN and Tonic: DNN as a service and its implications for future warehouse scale computers
J Hauswald, Y Kang, MA Laurenzano, Q Chen, C Li, T Mudge, ...
ACM SIGARCH Computer Architecture News 43 (3S), 27-40, 2015
Cited by: 199; Year: 2015
DeepSpeed-Inference: enabling efficient inference of transformer models at unprecedented scale
RY Aminabadi, S Rajbhandari, AA Awan, C Li, D Li, E Zheng, O Ruwase, ...
SC22: International Conference for High Performance Computing, Networking …, 2022
Cited by: 168; Year: 2022
Accelerating reduction and scan using tensor core units
A Dakkak, C Li, J Xiong, I Gelado, W Hwu
Proceedings of the ACM International Conference on Supercomputing, 46-57, 2019
Cited by: 91; Year: 2019
KLAP: Kernel launch aggregation and promotion for optimizing dynamic parallelism
I El Hajj, J Gómez-Luna, C Li, LW Chang, D Milojicic, W Hwu
2016 49th Annual IEEE/ACM International Symposium on Microarchitecture …, 2016
Cited by: 43; Year: 2016
Evaluating characteristics of CUDA communication primitives on high-bandwidth interconnects
C Pearson, A Dakkak, S Hashash, C Li, IH Chung, J Xiong, WM Hwu
Proceedings of the 2019 ACM/SPEC International Conference on Performance …, 2019
Cited by: 37; Year: 2019
XSP: Across-stack profiling and analysis of machine learning models on GPUs
C Li, A Dakkak, J Xiong, W Wei, L Xu, W Hwu
2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2020
Cited by: 32*; Year: 2020
Designing future warehouse-scale computers for Sirius, an end-to-end voice and vision personal assistant
J Hauswald, MA Laurenzano, Y Zhang, H Yang, Y Kang, C Li, A Rovinski, ...
ACM Transactions on Computer Systems (TOCS) 34 (1), 1-32, 2016
Cited by: 32; Year: 2016
A comprehensive study on post-training quantization for large language models
Z Yao, C Li, X Wu, S Youn, Y He
arXiv preprint arXiv:2303.08302, 2023
Cited by: 30; Year: 2023
TrIMS: Transparent and isolated model sharing for low latency deep learning inference in function-as-a-service
A Dakkak, C Li, SG De Gonzalo, J Xiong, W Hwu
2019 IEEE 12th International Conference on Cloud Computing (CLOUD), 372-382, 2019
Cited by: 30; Year: 2019
ZeroQuant-V2: Exploring post-training quantization in LLMs from comprehensive study to low rank compensation
Z Yao, X Wu, C Li, S Youn, Y He
arXiv preprint arXiv:2303.08302, 2023
Cited by: 23; Year: 2023
AI Matrix: A deep learning benchmark for Alibaba data centers
W Zhang, W Wei, L Xu, L Jin, C Li
arXiv preprint arXiv:1909.10562, 2019
Cited by: 21; Year: 2019
Understanding INT4 quantization for transformer models: Latency speedup, composability, and failure cases
X Wu, C Li, RY Aminabadi, Z Yao, Y He
arXiv preprint arXiv:2301.12017, 2023
Cited by: 19; Year: 2023
Frustrated with replicating claims of a shared model? A solution
A Dakkak, C Li, J Xiong, WM Hwu
arXiv preprint arXiv:1811.09737, 2018
Cited by: 16*; Year: 2018
Matrix factorization on GPUs with memory optimization and approximate computing
W Tan, S Chang, L Fong, C Li, Z Wang, L Cao
Proceedings of the 47th International Conference on Parallel Processing, 1-10, 2018
Cited by: 16; Year: 2018
Visual Domain Adaptation with Manifold Embedded Distribution Alignment
Y Wang, W Feng, Y Chen, H Yu, M Huang, PS Yu
ACM, 402-410, 2018
Cited by: 15; Year: 2018
Understanding INT4 quantization for language models: latency speedup, composability, and failure cases
X Wu, C Li, RY Aminabadi, Z Yao, Y He
International Conference on Machine Learning, 37524-37539, 2023
Cited by: 11; Year: 2023
MPress: Democratizing billion-scale model training on multi-GPU servers via memory-saving inter-operator parallelism
Q Zhou, H Wang, X Yu, C Li, Y Bai, F Yan, Y Xu
2023 IEEE International Symposium on High-Performance Computer Architecture …, 2023
Cited by: 11; Year: 2023
Random-LTD: Random and layerwise token dropping brings efficient training for large-scale transformers
Z Yao, X Wu, C Li, C Holmes, M Zhang, C Li, Y He
arXiv preprint arXiv:2211.11586, 2022
Cited by: 11; Year: 2022