Communication-efficient distributed deep learning: A comprehensive survey
Distributed deep learning (DL) has become prevalent in recent years to reduce training time
by leveraging multiple computing devices (e.g., GPUs/TPUs) due to larger models and …
Edge learning: The enabling technology for distributed big data analytics in the edge
Machine Learning (ML) has demonstrated great promise in various fields, e.g., self-driving,
smart city, which are fundamentally altering the way individuals and organizations live, work …
Towards scalable distributed training of deep learning on public cloud clusters
Distributed training techniques have been widely deployed in large-scale deep models
training on dense-GPU clusters. However, on public cloud clusters, due to the moderate …
Geryon: Accelerating distributed CNN training by network-level flow scheduling
Increasingly rich data sets and complicated models make distributed machine learning more
and more important. However, the cost of extensive and frequent parameter …
Communication optimization strategies for distributed deep neural network training: A survey
Recent trends in high-performance computing and deep learning have led to the
proliferation of studies on large-scale deep neural network training. However, the frequent …
Enabling all in-edge deep learning: A literature review
In recent years, deep learning (DL) models have demonstrated remarkable achievements
on non-trivial tasks such as speech recognition, image processing, and natural language …
Distributed learning systems with first-order methods
Scalable and efficient distributed learning is one of the main driving forces behind the recent
rapid advancement of machine learning and artificial intelligence. One prominent feature of …
FedBC: blockchain-based decentralized federated learning
X Wu, Z Wang, J Zhao, Y Zhang… - 2020 IEEE international …, 2020 - ieeexplore.ieee.org
Federated learning enables participants to collaborate on model training without directly
exchanging raw data. Existing federated learning methods often follow the parameter server …
Robust searching-based gradient collaborative management in intelligent transportation system
With the rapid development of big data and the Internet of Things (IoT), traffic data from an
Intelligent Transportation System (ITS) is becoming more and more accessible. To …
Peta-scale embedded photonics architecture for distributed deep learning applications
As Deep Learning (DL) models grow larger and more complex, training jobs are
increasingly distributed across multiple Computing Units (CU) such as GPUs and TPUs …