Convergence of edge computing and deep learning: A comprehensive survey
Ubiquitous sensors and smart devices from factories and communities are generating
massive amounts of data, and ever-increasing computing power is driving the core of …
Recent advances in deep learning for speech research at Microsoft
Deep learning is becoming a mainstream technology for speech recognition at industrial
scale. In this paper, we provide an overview of the work by Microsoft speech researchers …
PipeDream: Generalized pipeline parallelism for DNN training
DNN training is extremely time-consuming, necessitating efficient multi-accelerator
parallelization. Current approaches to parallelizing training primarily use intra-batch …
GPipe: Efficient training of giant neural networks using pipeline parallelism
Scaling up deep neural network capacity has been known as an effective approach to
improving model quality for several different machine learning tasks. In many cases …
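The pipeline-parallel idea behind GPipe can be illustrated with a minimal single-process sketch: a batch is split into micro-batches that each flow through a chain of stages. The stage functions and helper below are hypothetical toy layers, not GPipe's API; in a real deployment each stage lives on its own accelerator so stages process different micro-batches concurrently.

```python
# Hedged sketch of micro-batch pipelining (GPipe-style), simulated in
# one process with plain functions standing in for model partitions.
def pipeline_forward(stages, batch, num_microbatches):
    size = len(batch) // num_microbatches
    micro = [batch[i * size:(i + 1) * size] for i in range(num_microbatches)]
    outputs = []
    for mb in micro:                  # on real hardware these overlap in time
        x = mb
        for stage in stages:          # each stage is one pipeline cell
            x = stage(x)
        outputs.append(x)
    # re-assemble micro-batch outputs into one batch
    return [y for out in outputs for y in out]

stages = [
    lambda xs: [v + 1 for v in xs],   # stage 0: toy layer
    lambda xs: [v * 2 for v in xs],   # stage 1: toy layer
]
print(pipeline_forward(stages, [1, 2, 3, 4], num_microbatches=2))  # → [4, 6, 8, 10]
```

The point of the micro-batch split is that once stage 0 finishes micro-batch 0, it can start micro-batch 1 while stage 1 works on micro-batch 0, keeping all devices busy.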
[BOOK][B] Automatic speech recognition
Automatic Speech Recognition (ASR), which is aimed to enable natural human–machine
interaction, has been an intensive research area for decades. Many core technologies, such …
[PDF][PDF] 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs.
We show empirically that in SGD training of deep neural networks, one can, at no or nearly
no loss of accuracy, quantize the gradients aggressively—to but one bit per value—if the …
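The core mechanism described in this abstract, aggressive gradient quantization made safe by error feedback, can be sketched in a few lines. This is an illustrative NumPy toy, not the paper's implementation; the per-tensor scale and function name are assumptions.

```python
import numpy as np

def one_bit_sgd_step(grad, error):
    """Hedged sketch of 1-bit gradient quantization with error feedback:
    fold the previous step's quantization error into the gradient, send
    only the sign (scaled by the mean magnitude), and carry the residual
    forward so quantization error cancels out over successive steps."""
    corrected = grad + error
    scale = np.mean(np.abs(corrected))        # one float per tensor
    quantized = np.where(corrected >= 0, scale, -scale)
    new_error = corrected - quantized         # residual for the next step
    return quantized, new_error

# usage: each step communicates 1 bit per value plus one scale factor
err = np.zeros(2)
q, err = one_bit_sgd_step(np.array([1.0, -3.0]), err)
# q → [2.0, -2.0], err → [-1.0, -1.0]
```

Carrying `new_error` into the next step is what makes the 1-bit compression nearly lossless in expectation: mass dropped by quantization is retried rather than discarded.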
[HTML][HTML] Scalable distributed DNN training using commodity GPU cloud computing
N Ström - 2015 - amazon.science
We introduce a new method for scaling up distributed Stochastic Gradient Descent (SGD)
training of Deep Neural Networks (DNN). The method solves the well-known communication …
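One common way to attack the communication bottleneck this abstract refers to is threshold-based gradient compression with residual accumulation. The sketch below is a generic illustration of that family of techniques, not necessarily the exact method of this paper; the function name and fixed threshold `tau` are assumptions.

```python
import numpy as np

def threshold_compress(grad, residual, tau):
    """Hedged sketch of threshold-based gradient compression for
    data-parallel SGD: only entries whose (error-compensated) magnitude
    reaches tau are communicated, quantized to +/- tau; everything else
    stays in a local residual and is retried on later steps."""
    acc = grad + residual
    mask = np.abs(acc) >= tau
    sent = np.where(mask, np.sign(acc) * tau, 0.0)
    new_residual = acc - sent        # unsent gradient mass is never lost
    return sent, new_residual

sent, res = threshold_compress(np.array([0.5, -2.0, 1.5]), np.zeros(3), tau=1.0)
# sent → [0.0, -1.0, 1.0], res → [0.5, -1.0, 0.5]
```

In a real system only the (index, sign) pairs of the masked entries go on the wire, which is what makes the method viable over commodity cloud network links.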
Deep learning: methods and applications
This monograph provides an overview of general deep learning methodology and its
applications to a variety of signal and information processing tasks. The application areas …
PipeDream: Fast and efficient pipeline parallel DNN training
PipeDream is a Deep Neural Network (DNN) training system for GPUs that parallelizes
computation by pipelining execution across multiple machines. Its pipeline parallel …
Data movement is all you need: A case study on optimizing transformers
Transformers are one of the most important machine learning workloads today. Training one
is a very compute-intensive task, often taking days or weeks, and significant attention has …