Online credit card fraud detection: a hybrid framework with big data technologies Y Dai, J Yan, X Tang, H Zhao, M Guo 2016 IEEE Trustcom/BigDataSE/ISPA, 1644-1651, 2016 | 55 | 2016 |
Towards scalable and reliable in-memory storage system: A case study with Redis S Chen, X Tang, H Wang, H Zhao, M Guo 2016 IEEE Trustcom/BigDataSE/ISPA, 1660-1667, 2016 | 50 | 2016 |
Enable simultaneous DNN services based on deterministic operator overlap and precise latency prediction W Cui, H Zhao, Q Chen, N Zheng, J Leng, J Zhao, Z Song, T Ma, Y Yang, ... Proceedings of the International Conference for High Performance Computing …, 2021 | 49 | 2021 |
DVABatch: Diversity-aware Multi-Entry Multi-Exit batching for efficient processing of DNN services on GPUs W Cui, H Zhao, Q Chen, H Wei, Z Li, D Zeng, C Li, M Guo 2022 USENIX Annual Technical Conference (USENIX ATC 22), 183-198, 2022 | 37 | 2022 |
Tacker: Tensor-CUDA core kernel fusion for improving the GPU utilization while ensuring QoS H Zhao, W Cui, Q Chen, Y Zhang, Y Lu, C Li, J Leng, M Guo 2022 IEEE International Symposium on High-Performance Computer Architecture …, 2022 | 20 | 2022 |
E2bird: Enhanced Elastic Batch for Improving Responsiveness and Throughput of Deep Learning Services W Cui, Q Chen, H Zhao, M Wei, X Tang, M Guo IEEE Transactions on Parallel and Distributed Systems 32 (6), 1307-1321, 2020 | 19 | 2020 |
CODA: Improving resource utilization by slimming and co-locating DNN and CPU jobs H Zhao, W Cui, Q Chen, J Leng, K Yu, D Zeng, C Li, M Guo 2020 IEEE 40th International Conference on Distributed Computing Systems …, 2020 | 14 | 2020 |
Bandwidth and locality aware task-stealing for manycore architectures with bandwidth-asymmetric memory H Zhao, Q Chen, Y Qiu, M Wu, Y Shen, J Leng, C Li, M Guo ACM Transactions on Architecture and Code Optimization (TACO) 15 (4), 1-26, 2018 | 12 | 2018 |
Exploiting intra-SM parallelism in GPUs via persistent and elastic blocks H Zhao, W Cui, Q Chen, J Zhao, J Leng, M Guo 2021 IEEE 39th International Conference on Computer Design (ICCD), 290-298, 2021 | 10 | 2021 |
ISPA: Exploiting Intra-SM Parallelism in GPUs via Fine-Grained Resource Management H Zhao, W Cui, Q Chen, M Guo IEEE Transactions on Computers 72 (5), 1473-1487, 2022 | 2 | 2022 |
A Codesign of Scheduling and Parallelization for Large Model Training in Heterogeneous Clusters C Xue, W Cui, H Zhao, Q Chen, S Zhang, P Yang, J Yang, S Li, M Guo arXiv preprint arXiv:2403.16125, 2024 | 1 | 2024 |
Improving Cluster Utilization through Adaptive Resource Management for DNN and CPU Jobs Co-location H Zhao, W Cui, Q Chen, J Leng, D Zeng, M Guo IEEE Transactions on Computers, 2023 | 1 | 2023 |
Potamoi: Accelerating Neural Rendering via a Unified Streaming Architecture Y Feng, W Lin, Z Liu, J Leng, M Guo, H Zhao, X Hou, J Zhao, Y Zhu ACM Transactions on Architecture and Code Optimization, 2024 | | 2024 |
FaaSMem: Improving Memory Efficiency of Serverless Computing with Memory Pool Architecture C Xu, Y Liu, Z Li, Q Chen, H Zhao, D Zeng, Q Peng, X Wu, H Zhao, S Fu, ... Proceedings of the 29th ACM International Conference on Architectural …, 2024 | | 2024 |
Towards Fast Setup and High Throughput of GPU Serverless Computing H Zhao, W Cui, Q Chen, S Zhang, Z Li, J Leng, C Li, D Zeng, M Guo arXiv preprint arXiv:2404.14691, 2024 | | 2024 |
Maximizing the Utilization of GPUs Used by Cloud Gaming through Adaptive Co-location with Combo B Chen, H Zhao, W Cui, Y He, S Zhang, Q Chen, Z Li, M Guo Proceedings of the 2023 ACM Symposium on Cloud Computing, 265-280, 2023 | | 2023 |