A survey of machine learning for computer architecture and systems
It has been a long time that computer architecture and systems are optimized for efficient
execution of machine learning (ML) models. Now, it is time to reconsider the relationship …
Research challenges in parallel and distributed simulation
RM Fujimoto - ACM Transactions on Modeling and Computer …, 2016 - dl.acm.org
The parallel and distributed simulation field has evolved and grown from its origins in the
1970s and 1980s and remains an active field of research to this day. A brief overview of …
Machine learning in compiler optimization
In the last decade, machine-learning-based compilation has moved from an obscure
research niche to a mainstream activity. In this paper, we describe the relationship between …
Hybrid MPI/OpenMP parallel programming on clusters of multi-core SMP nodes
R Rabenseifner, G Hager, G Jost - 2009 17th Euromicro …, 2009 - ieeexplore.ieee.org
Today most systems in high-performance computing (HPC) feature a hierarchical hardware
design: Shared memory nodes with several multi-core CPUs are connected via a network …
Selecting stars: The k most representative skyline operator
Skyline computation has many applications including multi-criteria decision making. In this
paper, we study the problem of selecting k skyline points so that the number of points, which …
Exploring hardware overprovisioning in power-constrained, high performance computing
Most recent research in power-aware supercomputing has focused on making individual
nodes more efficient and measuring the results in terms of flops per watt. While this work is …
A simplified and accurate model of power-performance efficiency on emergent GPU architectures
S Song, C Su, B Rountree… - 2013 IEEE 27th …, 2013 - ieeexplore.ieee.org
Emergent heterogeneous systems must be optimized for both power and performance at
exascale. Massive parallelism combined with complex memory hierarchies form a barrier to …
Hybrid MPI/OpenMP power-aware computing
Power-aware execution of parallel programs is now a primary concern in large-scale HPC
environments. Prior research in this area has explored models and algorithms based on …
Predicting performance impact of DVFS for realistic memory systems
R Miftakhutdinov, E Ebrahimi… - 2012 45th Annual IEEE …, 2012 - ieeexplore.ieee.org
Dynamic voltage and frequency scaling (DVFS) can make modern processors more power
and energy efficient if we can accurately predict the effect of frequency scaling on processor …
A reconfiguration algorithm for power-aware parallel applications
In current computing systems, many applications require guarantees on their maximum
power consumption to not exceed the available power budget. On the other hand, for some …