Authors
Ali Karami, Farshad Khunjush, Seyyed Ali Mirsoleimani
Publication date
2015/8
Journal
The Journal of Supercomputing
Volume
71
Pages
2900-2921
Publisher
Springer US
Description
Understanding the performance bottlenecks of applications in high-performance computing can lead to dramatic improvements in their performance. For example, a key problem in GPU programming is finding performance bottlenecks and resolving them to reach the best possible performance. In GPU architectures, these bottlenecks stem from a number of factors, such as memory access latency, branch divergence, utilization, and the amount of available parallelism. In addition, simple profiling cannot reveal the relations between these bottlenecks. In this paper, we propose a statistical performance analyzer framework that not only helps us find bottlenecks but also indicates the relations between them, which is not possible using a profiler. Recently, OpenCL has been proposed for use on a variety of platforms, e.g., CPUs and GPUs, enabling a program written for one platform to be ported to other …
Total citations
[Chart: citations per year, 2015 to 2022]