Demystifying parallel and distributed deep learning: An in-depth concurrency analysis
Deep Neural Networks (DNNs) are becoming an important tool in modern computing
applications. Accelerating their training is a major challenge and techniques range from …
A survey of techniques for optimizing deep learning on GPUs
The rise of deep-learning (DL) has been fuelled by the improvements in accelerators. Due to
its unique features, the GPU continues to remain the most widely used accelerator for DL …
A linear speedup analysis of distributed deep learning with sparse and quantized communication
P Jiang, G Agrawal - Advances in Neural Information …, 2018 - proceedings.neurips.cc
The large communication overhead has imposed a bottleneck on the performance of
distributed Stochastic Gradient Descent (SGD) for training deep neural networks. Previous …
HPC cloud for scientific and business applications: taxonomy, vision, and research challenges
High performance computing (HPC) clouds are becoming an alternative to on-premise
clusters for executing scientific applications and business analytics services. Most research …
Communication-efficient distributed deep learning: A comprehensive survey
Distributed deep learning (DL) has become prevalent in recent years to reduce training time
by leveraging multiple computing devices (e.g., GPUs/TPUs) due to larger models and …
Deep learning application in plant stress imaging: a review
Plant stress is one of the major issues that cause significant economic loss for growers. The
labor-intensive conventional methods for identifying the stressed plants constrain their …
SparCML: High-performance sparse communication for machine learning
Applying machine learning techniques to the quickly growing data in science and industry
requires highly-scalable algorithms. Large datasets are most commonly processed "data …
Advancements in accelerating deep neural network inference on AIoT devices: A survey
The amalgamation of artificial intelligence with Internet of Things (AIoT) devices has seen a
rapid surge in growth, largely due to the effective implementation of deep neural network …
The MVAPICH project: Transforming research into high-performance MPI library for HPC community
High-Performance Computing (HPC) research, from hardware and software to the
end applications, provides remarkable computing power to help scientists solve complex …
Performance modeling and evaluation of distributed deep learning frameworks on GPUs
Deep learning frameworks have been widely deployed on GPU servers for deep learning
applications in both academia and industry. In training deep neural networks (DNNs), there …