Ruslan Svirschevski
Yandex
Verified email at yandex-team.ru
Title
Cited by
Year
SpQR: A sparse-quantized representation for near-lossless LLM weight compression
T Dettmers, R Svirschevski, V Egiazarian, D Kuznedelev, E Frantar, ...
arXiv preprint arXiv:2306.03078, 2023
Cited by: 124, Year: 2023
Sequoia: Scalable, robust, and hardware-aware speculative decoding
Z Chen, A May, R Svirschevski, Y Huang, M Ryabinin, Z Jia, B Chen
arXiv preprint arXiv:2402.12374, 2024
Cited by: 10, Year: 2024
SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices
R Svirschevski, A May, Z Chen, B Chen, Z Jia, M Ryabinin
arXiv preprint arXiv:2406.02532, 2024
Cited by: 1, Year: 2024
Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization
V Egiazarian, D Kuznedelev, A Voronov, R Svirschevski, M Goin, ...
arXiv preprint arXiv:2409.00492, 2024
Year: 2024
Privacy Preserving API Fine-tuning for LLMs
P Zmushko, M Mansurov, R Svirschevski, D Kuznedelev, M Ryabinin, ...