Communication-efficient distributed deep learning: A comprehensive survey

Z Tang, S Shi, W Wang, B Li, X Chu - arXiv preprint arXiv:2003.06307, 2020 - arxiv.org
Distributed deep learning (DL) has become prevalent in recent years to reduce training time
by leveraging multiple computing devices (e.g., GPUs/TPUs) due to larger models and …

Sparsified SGD with memory

SU Stich, JB Cordonnier… - Advances in neural …, 2018 - proceedings.neurips.cc
Huge scale machine learning problems are nowadays tackled by distributed optimization
algorithms, i.e., algorithms that leverage the compute power of many devices for training. The …
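
The mechanism behind the title, top-k sparsification of the stochastic gradient combined with an error-feedback memory that re-injects the dropped coordinates at later steps, can be sketched in a few lines. The single-worker toy below is only illustrative; names such as `top_k` and `sparsified_sgd_step` and the least-squares example are assumptions, not the authors' reference code.

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def sparsified_sgd_step(w, grad, memory, lr=0.05, k=10):
    """One step: add the accumulated error to the scaled gradient,
    keep (and, in a distributed run, transmit) only the top-k entries,
    and store the discarded remainder as memory for later steps."""
    update = memory + lr * grad          # error feedback: re-inject previously dropped mass
    sparse_update = top_k(update, k)     # only this sparse vector would be communicated
    memory = update - sparse_update      # residual carried to the next iteration
    return w - sparse_update, memory

# toy usage: least-squares gradients on random data
rng = np.random.default_rng(0)
A, b = rng.normal(size=(200, 100)), rng.normal(size=200)
w, memory = np.zeros(100), np.zeros(100)
for _ in range(200):
    grad = A.T @ (A @ w - b) / len(b)
    w, memory = sparsified_sgd_step(w, grad, memory, lr=0.05, k=10)
```

In a distributed run only `sparse_update` would be sent over the network; the memory ensures coordinates dropped by the compressor are not lost but applied in later iterations.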

CoCoA: A general framework for communication-efficient distributed optimization

V Smith, S Forte, C Ma, M Takáč, MI Jordan… - Journal of Machine …, 2018 - jmlr.org
The scale of modern datasets necessitates the development of efficient distributed
optimization methods for machine learning. We present a general-purpose framework for …
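
The communication pattern that CoCoA generalizes, an approximate local subproblem solved on each worker's data partition followed by aggregation of the resulting updates, can be illustrated with ridge regression and a local SDCA solver. The sketch below is only a schematic of that pattern under assumed names and constants, not the reference CoCoA implementation.

```python
import numpy as np

def local_sdca(X, y, alpha, w, lam, n_total, steps, rng):
    """Local dual coordinate ascent (SDCA) on one worker's example partition
    for the objective (1/n) * sum_i 0.5*(x_i @ w - y_i)**2 + (lam/2)*||w||^2.
    Returns the proposed local dual and primal changes (not yet applied)."""
    d_alpha = np.zeros_like(alpha)
    d_w = np.zeros_like(w)
    for _ in range(steps):
        i = int(rng.integers(len(y)))
        x_i = X[i]
        # closed-form dual coordinate step for the squared loss
        g = y[i] - x_i @ (w + d_w) - (alpha[i] + d_alpha[i])
        step = g / (1.0 + x_i @ x_i / (lam * n_total))
        d_alpha[i] += step
        d_w += step * x_i / (lam * n_total)
    return d_alpha, d_w

def cocoa_style(X, y, lam=0.1, n_workers=4, rounds=20, local_steps=100, seed=0):
    """Ridge regression via a CoCoA-style local-solve / aggregate loop."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(n), n_workers)   # partition examples
    alphas = [np.zeros(len(p)) for p in parts]              # local dual variables
    w = np.zeros(d)                                         # shared primal vector
    for _ in range(rounds):
        updates = [local_sdca(X[p], y[p], alphas[k], w, lam, n, local_steps, rng)
                   for k, p in enumerate(parts)]            # would run in parallel
        for k, (d_alpha, d_w) in enumerate(updates):
            alphas[k] += d_alpha / n_workers                # conservative (averaging)
            w += d_w / n_workers                            # aggregation of the updates
    return w

# toy usage
rng = np.random.default_rng(3)
X, y = rng.normal(size=(400, 20)), rng.normal(size=400)
w_hat = cocoa_style(X, y)
```

Averaging the local updates (dividing by the number of workers) is the conservative aggregation choice; the CoCoA line of work analyzes when more aggressive adding of updates is safe.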

FusionAI: Decentralized training and deploying LLMs with massive consumer-level GPUs

Z Tang, Y Wang, X He, L Zhang, X Pan, Q Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
The rapid growth of memory and computation requirements of large language models
(LLMs) has outpaced the development of hardware, hindering people who lack large-scale …

Efficient greedy coordinate descent for composite problems

SP Karimireddy, A Koloskova… - The 22nd …, 2019 - proceedings.mlr.press
Coordinate descent with random coordinate selection is the current state of the art for many
large scale optimization problems. However, greedy selection of the steepest coordinate on …
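
As a point of reference for the "steepest coordinate" idea, here is a minimal greedy coordinate descent sketch for a lasso-type composite objective, using a GS-r-style rule that picks the coordinate whose proximal step moves the most. The rule and all names are illustrative; the paper's contribution concerns efficient and principled greedy selection for composite problems, which this toy does not reproduce.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*|.|, applied elementwise."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def greedy_cd(A, b, lam=0.1, iters=200):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1, one coordinate per iteration,
    chosen greedily by the size of its prospective prox step (GS-r-style)."""
    n = A.shape[1]
    x = np.zeros(n)
    L = (A ** 2).sum(axis=0)                 # per-coordinate Lipschitz constants
    residual = A @ x - b
    for _ in range(iters):
        grad = A.T @ residual                # gradient of the smooth part
        # candidate prox step for every coordinate
        step = soft_threshold(x - grad / L, lam / L) - x
        i = int(np.argmax(np.abs(step)))     # greedy: the coordinate that moves most
        residual += A[:, i] * step[i]
        x[i] += step[i]
    return x

# toy usage
rng = np.random.default_rng(1)
A, b = rng.normal(size=(100, 50)), rng.normal(size=100)
x_hat = greedy_cd(A, b)
```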

Snap ML: A hierarchical framework for machine learning

C Dünner, T Parnell, D Sarigiannis… - Advances in …, 2018 - proceedings.neurips.cc
We describe a new software framework for fast training of generalized linear models. The
framework, named Snap Machine Learning (Snap ML), combines recent advances in …

Coordinate descent with bandit sampling

F Salehi, P Thiran, E Celis - Advances in Neural …, 2018 - proceedings.neurips.cc
Coordinate descent methods minimize a cost function by updating a single decision variable
(corresponding to one coordinate) at a time. Ideally, we would update the decision variable …
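
One way to picture "bandit sampling" here is to treat each coordinate as an arm whose reward is the objective decrease it produced when last updated, trading off exploration of untried coordinates against exploitation of the most promising one. The epsilon-greedy sketch below is only a schematic reading of the title; the paper develops a specific bandit sampler with guarantees, and every name and constant here is an assumption.

```python
import numpy as np

def bandit_cd(A, b, iters=500, eps=0.2, seed=0):
    """Minimize 0.5*||Ax - b||^2, choosing coordinates with an epsilon-greedy
    bandit over each coordinate's last observed objective decrease."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    x = np.zeros(n)
    L = (A ** 2).sum(axis=0)               # coordinate-wise Lipschitz constants
    reward = np.full(n, np.inf)            # optimistic init: every coordinate gets tried
    residual = A @ x - b
    for _ in range(iters):
        if rng.random() < eps:
            i = int(rng.integers(n))       # explore a random coordinate
        else:
            i = int(np.argmax(reward))     # exploit the most promising coordinate
        g = A[:, i] @ residual             # partial derivative along coordinate i
        step = -g / L[i]                   # exact coordinate-wise minimization
        before = 0.5 * residual @ residual
        residual += A[:, i] * step
        x[i] += step
        reward[i] = before - 0.5 * residual @ residual  # observed decrease = reward
    return x

# toy usage
rng = np.random.default_rng(2)
A, b = rng.normal(size=(120, 60)), rng.normal(size=120)
x_hat = bandit_cd(A, b)
```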

FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression

Z Tang, X Kang, Y Yin, X Pan, Y Wang, X He… - arXiv preprint arXiv …, 2024 - arxiv.org
To alleviate hardware scarcity in training large deep neural networks (DNNs), particularly
large language models (LLMs), we present FusionLLM, a decentralized training system …

Faster training by selecting samples using embeddings

S Gonzalez, J Landgraf… - 2019 International Joint …, 2019 - ieeexplore.ieee.org
Long training times have increasingly become a burden for researchers by slowing down
the pace of innovation, with some models taking days or weeks to train. In this paper, a new …

SySCD: A system-aware parallel coordinate descent algorithm

N Ioannou, C Mendler-Dünner… - Advances in Neural …, 2019 - proceedings.neurips.cc
In this paper we propose a novel parallel stochastic coordinate descent (SCD) algorithm
with convergence guarantees that exhibits strong scalability. We start by studying a state-of …