Mapping parallelism to multi-cores: a machine learning based approach
Z Wang, MFP O'Boyle - Proceedings of the 14th ACM SIGPLAN …, 2009 - dl.acm.org
The efficient mapping of program parallelism to multi-core processors is highly dependent
on the underlying architecture. This paper proposes a portable and automatic compiler …
A framework for end-to-end simulation of high-performance computing systems
We present an end-to-end simulation framework that is capable of simulating High-
Performance Computing (HPC) systems with hundreds of thousands of interconnected …
Characteristics of workloads used in high performance and technical computing
R Cheveresan, M Ramsay, C Feucht… - Proceedings of the 21st …, 2007 - dl.acm.org
This paper provides a systematic comparison of various characteristics of computationally-
intensive workloads. Our analysis focuses on standard HPC benchmarks and representative …
An instrumentation approach for hardware-agnostic software characterization
A Anghel, LM Vasilescu, R Jongerius… - Proceedings of the 12th …, 2015 - dl.acm.org
Simulators and empirical profiling data are often used to understand how suitable a specific
hardware architecture is for an application. However, simulators can be slow, and empirical …
Predicting HPC parallel program performance based on LLVM compiler
Performance prediction of parallel program plays key roles in many areas, such as parallel
system design, parallel program optimization, and parallel system procurement. Accurate …
Runtime scheduling of dynamic parallelism on accelerator-based multi-core systems
We explore runtime mechanisms and policies for scheduling dynamic multi-grain parallelism
on heterogeneous multi-core processors. Heterogeneous multi-core processors integrate …
Modeling multigrain parallelism on heterogeneous multi-core processors: A case study of the Cell BE
Heterogeneous multi-core processors invest the most significant portion of their transistor
budget in customized “accelerator” cores, while using a small number of conventional low …
Software probes: Towards a quick method for machine characterization and application performance prediction
Computers perform different applications in different ways. To characterize an application
performance into a machine, the usual method is a throughout execution of it. This work is a …
[BOOK][B] A compile-time OpenMP cost model
C Liao - 2007 - search.proquest.com
OpenMP is a de facto API for parallel programming in C/C++ and Fortran on shared memory
and distributed shared memory platforms. It is also being increasingly used with MPI to form …
Scaling application properties to exascale
Exascale computing systems will execute computationally intensive tasks on unprecedented
amounts of data. Tuning the design of such systems for a specific application or for an …