Performance analysis of CNN frameworks for GPUs

H Kim, H Nam, W Jung, J Lee - 2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2017 - ieeexplore.ieee.org
Thanks to modern deep learning frameworks that exploit GPUs, convolutional neural networks (CNNs) have been highly successful in visual recognition tasks. In this paper, we analyze the GPU performance characteristics of five popular deep learning frameworks (Caffe, CNTK, TensorFlow, Theano, and Torch) from the perspective of a representative CNN model, AlexNet. Based on the characteristics obtained, we suggest possible optimization methods to increase the efficiency of CNN models built with these frameworks. We also show the GPU performance characteristics of different convolution algorithms, each of which uses one of GEMM, direct convolution, FFT, or the Winograd method, and we suggest criteria for choosing convolution algorithms for GPUs and methods for building efficient CNN models on GPUs. Since scaling DNNs across multiple GPUs is increasingly important, we also analyze the scalability of the CNN models built by these frameworks in the multi-GPU setting, together with the associated overhead. The results indicate that training of the AlexNet model can be sped up by as much as 2x simply by changing options provided by the frameworks.
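The GEMM-based convolution mentioned in the abstract lowers input patches into a matrix (commonly called "im2col") so the whole convolution becomes a single matrix multiplication, which maps well onto GPU BLAS libraries. The sketch below is a minimal single-channel NumPy illustration of that idea, not code from the paper; the function names are our own.

```python
import numpy as np

def im2col(x, kh, kw):
    """Unroll every kh x kw patch of a 2-D image into one column."""
    H, W = x.shape
    out_h, out_w = H - kh + 1, W - kw + 1
    cols = np.empty((kh * kw, out_h * out_w))
    idx = 0
    for i in range(out_h):
        for j in range(out_w):
            cols[:, idx] = x[i:i + kh, j:j + kw].ravel()
            idx += 1
    return cols

def conv2d_gemm(x, k):
    """'Valid' convolution (cross-correlation) expressed as one GEMM."""
    kh, kw = k.shape
    cols = im2col(x, kh, kw)           # shape: (kh*kw, out_h*out_w)
    out = k.ravel() @ cols             # the entire convolution is one product
    return out.reshape(x.shape[0] - kh + 1, x.shape[1] - kw + 1)

x = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((2, 2))
print(conv2d_gemm(x, k))  # each entry is the sum of a 2x2 window
```

In a real framework the lowering covers batches and channels, so the GEMM operands are much larger; the memory cost of materializing the im2col buffer is one reason frameworks also offer direct, FFT, and Winograd algorithms.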