Title | Authors | Venue | Cited by | Year
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | | 1370 | 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU | Y Sheng, L Zheng, B Yuan, Z Li, M Ryabinin, B Chen, P Liang, C Ré, ... | International Conference on Machine Learning, 31094-31116 | 187 | 2023
Distributed Deep Learning in Open Collaborations | M Diskin*, A Bukhtiyarov*, M Ryabinin*, L Saulnier, Q Lhoest, A Sinitsin, ... | Advances in Neural Information Processing Systems 34 (NeurIPS 2021) | 48 | 2021
Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts | M Ryabinin, A Gusev | Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 3659–3672 | 44 | 2020
Petals: Collaborative Inference and Fine-tuning of Large Models | A Borzunov, D Baranchuk, T Dettmers, M Ryabinin, Y Belkada, ... | arXiv preprint arXiv:2209.01188 | 39 | 2022
It's All in the Heads: Using Attention Heads as a Baseline for Cross-Lingual Transfer in Commonsense Reasoning | A Tikhonov*, M Ryabinin* | Findings of the ACL 2021, 3534–3546 | 30 | 2021
Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices | M Ryabinin*, E Gorbunov*, V Plokhotnyuk, G Pekhimenko | Advances in Neural Information Processing Systems 34 (NeurIPS 2021) | 27 | 2021
Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets | M Ryabinin, A Malinin, M Gales | Advances in Neural Information Processing Systems 34 (NeurIPS 2021) | 20 | 2021
Distributed Inference and Fine-tuning of Large Language Models Over The Internet | A Borzunov, M Ryabinin, A Chumachenko, D Baranchuk, T Dettmers, ... | arXiv preprint arXiv:2312.08361 | 17 | 2023
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient | M Ryabinin, T Dettmers, M Diskin, A Borzunov | arXiv preprint arXiv:2301.11913 | 15 | 2023
RuCoLA: Russian Corpus of Linguistic Acceptability | V Mikhailov, T Shamardina, M Ryabinin, A Pestova, I Smurov, E Artemova | arXiv preprint arXiv:2210.12814 | 15 | 2022
Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees | A Beznosikov, P Richtárik, M Diskin, M Ryabinin, A Gasnikov | Advances in Neural Information Processing Systems 35, 14013-14029 | 14 | 2022
Secure Distributed Training at Scale | E Gorbunov, A Borzunov, M Diskin, M Ryabinin | International Conference on Machine Learning, 7679-7739 | 14 | 2022
Mind Your Format: Towards Consistent Evaluation of In-Context Learning Improvements | A Voronov, L Wolf, M Ryabinin | arXiv preprint arXiv:2401.06766 | 12 | 2024
Sequoia: Scalable, Robust, and Hardware-Aware Speculative Decoding | Z Chen, A May, R Svirschevski, Y Huang, M Ryabinin, Z Jia, B Chen | arXiv preprint arXiv:2402.12374 | 10 | 2024
Embedding Words in Non-Vector Space with Unsupervised Graph Learning | M Ryabinin, S Popov, L Prokhorenkova, E Voita | Empirical Methods in Natural Language Processing (EMNLP 2020), 7317–7331 | 10 | 2020
Training Transformers Together | A Borzunov, M Ryabinin, T Dettmers, Q Lhoest, L Saulnier, M Diskin, ... | Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track 176 … | 9 | 2022
Is This Loss Informative? Faster Text-to-Image Customization by Tracking Objective Dynamics | A Voronov, M Khoroshikh, A Babenko, M Ryabinin | Advances in Neural Information Processing Systems 36 | 4* | 2024
The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models | G Hong, AP Gema, R Saxena, X Du, P Nie, Y Zhao, L Perez-Beltrachini, ... | arXiv preprint arXiv:2404.05904 | 1 | 2024
Adaptive Prediction Time for Sequence Classification | M Ryabinin, E Lobacheva | | 1 | 2018