Han Zhao
Verified email at sjtu.edu.cn
Title
Cited by
Year
Online credit card fraud detection: a hybrid framework with big data technologies
Y Dai, J Yan, X Tang, H Zhao, M Guo
2016 IEEE Trustcom/BigDataSE/ISPA, 1644-1651, 2016
Cited by 55, 2016
Towards scalable and reliable in-memory storage system: A case study with Redis
S Chen, X Tang, H Wang, H Zhao, M Guo
2016 IEEE Trustcom/BigDataSE/ISPA, 1660-1667, 2016
Cited by 50, 2016
Enable simultaneous DNN services based on deterministic operator overlap and precise latency prediction
W Cui, H Zhao, Q Chen, N Zheng, J Leng, J Zhao, Z Song, T Ma, Y Yang, ...
Proceedings of the International Conference for High Performance Computing …, 2021
Cited by 49, 2021
DVABatch: Diversity-aware Multi-Entry Multi-Exit batching for efficient processing of DNN services on GPUs
W Cui, H Zhao, Q Chen, H Wei, Z Li, D Zeng, C Li, M Guo
2022 USENIX Annual Technical Conference (USENIX ATC 22), 183-198, 2022
Cited by 37, 2022
Tacker: Tensor-CUDA core kernel fusion for improving the GPU utilization while ensuring QoS
H Zhao, W Cui, Q Chen, Y Zhang, Y Lu, C Li, J Leng, M Guo
2022 IEEE International Symposium on High-Performance Computer Architecture …, 2022
Cited by 20, 2022
E2bird: Enhanced Elastic Batch for Improving Responsiveness and Throughput of Deep Learning Services
W Cui, Q Chen, H Zhao, M Wei, X Tang, M Guo
IEEE Transactions on Parallel and Distributed Systems 32 (6), 1307-1321, 2020
Cited by 19, 2020
CODA: Improving resource utilization by slimming and co-locating DNN and CPU jobs
H Zhao, W Cui, Q Chen, J Leng, K Yu, D Zeng, C Li, M Guo
2020 IEEE 40th International Conference on Distributed Computing Systems …, 2020
Cited by 14, 2020
Bandwidth and locality aware task-stealing for manycore architectures with bandwidth-asymmetric memory
H Zhao, Q Chen, Y Qiu, M Wu, Y Shen, J Leng, C Li, M Guo
ACM Transactions on Architecture and Code Optimization (TACO) 15 (4), 1-26, 2018
Cited by 12, 2018
Exploiting intra-SM parallelism in GPUs via persistent and elastic blocks
H Zhao, W Cui, Q Chen, J Zhao, J Leng, M Guo
2021 IEEE 39th International Conference on Computer Design (ICCD), 290-298, 2021
Cited by 10, 2021
ISPA: Exploiting Intra-SM Parallelism in GPUs via Fine-Grained Resource Management
H Zhao, W Cui, Q Chen, M Guo
IEEE Transactions on Computers 72 (5), 1473-1487, 2022
Cited by 2, 2022
A Codesign of Scheduling and Parallelization for Large Model Training in Heterogeneous Clusters
C Xue, W Cui, H Zhao, Q Chen, S Zhang, P Yang, J Yang, S Li, M Guo
arXiv preprint arXiv:2403.16125, 2024
Cited by 1, 2024
Improving Cluster Utilization through Adaptive Resource Management for DNN and CPU Jobs Co-location
H Zhao, W Cui, Q Chen, J Leng, D Zeng, M Guo
IEEE Transactions on Computers, 2023
Cited by 1, 2023
Potamoi: Accelerating Neural Rendering via a Unified Streaming Architecture
Y Feng, W Lin, Z Liu, J Leng, M Guo, H Zhao, X Hou, J Zhao, Y Zhu
ACM Transactions on Architecture and Code Optimization, 2024
2024
FaaSMem: Improving Memory Efficiency of Serverless Computing with Memory Pool Architecture
C Xu, Y Liu, Z Li, Q Chen, H Zhao, D Zeng, Q Peng, X Wu, H Zhao, S Fu, ...
Proceedings of the 29th ACM International Conference on Architectural …, 2024
2024
Towards Fast Setup and High Throughput of GPU Serverless Computing
H Zhao, W Cui, Q Chen, S Zhang, Z Li, J Leng, C Li, D Zeng, M Guo
arXiv preprint arXiv:2404.14691, 2024
2024
Maximizing the Utilization of GPUs Used by Cloud Gaming through Adaptive Co-location with Combo
B Chen, H Zhao, W Cui, Y He, S Zhang, Q Chen, Z Li, M Guo
Proceedings of the 2023 ACM Symposium on Cloud Computing, 265-280, 2023
2023