Wonbeom Lee
Verified email at snu.ac.kr
Title
Cited by
Year
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management
W Lee, J Lee, J Seo, J Sim
18th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2024
Cited by: 6
Year: 2024
Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization
J Lee, W Lee, J Sim
arXiv preprint arXiv:2406.12930, 2024
Cited by: 2
Year: 2024