Bingyang Wu
Title
Cited by
Year
AMOS: enabling automatic mapping for tensor computations on spatial accelerators with hardware abstraction
S Zheng, R Chen, A Wei, Y Jin, Q Han, L Lu, B Wu, X Li, S Yan, Y Liang
Proceedings of the 49th Annual International Symposium on Computer …, 2022
Cited by: 41, Year: 2022
Fast distributed inference serving for large language models
B Wu, Y Zhong, Z Zhang, G Huang, X Liu, X Jin
arXiv preprint arXiv:2305.05920, 2023
Cited by: 28, Year: 2023
A survey of resource-efficient LLM and multimodal foundation models
M Xu, W Yin, D Cai, R Yi, D Xu, Q Wang, B Wu, Y Zhao, C Yang, S Wang, ...
arXiv preprint arXiv:2401.08092, 2024
Cited by: 22, Year: 2024
Transparent GPU sharing in container clouds for deep learning workloads
B Wu, Z Zhang, Z Bai, X Liu, X Jin
20th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2023
Cited by: 15, Year: 2023
NeoFlow: A flexible framework for enabling efficient compilation for high performance DNN training
S Zheng, R Chen, Y Jin, A Wei, B Wu, X Li, S Yan, Y Liang
IEEE Transactions on Parallel and Distributed Systems 33 (11), 3220-3232, 2021
Cited by: 11, Year: 2021
LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism
B Wu, S Liu, Y Zhong, P Sun, X Liu, X Jin
arXiv preprint arXiv:2404.09526, 2024
Cited by: 3, Year: 2024
XRON: A hybrid elastic cloud overlay network for video conferencing at planetary scale
B Wu, K Qian, B Li, Y Ma, Q Zhang, Z Jiang, J Zhao, D Cai, E Zhai, X Liu, ...
Proceedings of the ACM SIGCOMM 2023 Conference, 696-709, 2023
Cited by: 3, Year: 2023
dLoRA: Dynamically Orchestrating Requests and Adapters for LoRA LLM Serving
B Wu, R Zhu, Z Zhang, P Sun, X Liu, X Jin