HOTL: a higher order theory of locality
The locality metrics are many, for example, miss ratio to test performance, data footprint to
manage cache sharing, and reuse distance to analyze and optimize a program. It is unclear …
manage cache sharing, and reuse distance to analyze and optimize a program. It is unclear …
{LAMA}: Optimized locality-aware memory allocation for key-value cache
The in-memory cache system is a performance-critical layer in today's web server
architecture. Memcached is one of the most effective, representative, and prevalent among …
architecture. Memcached is one of the most effective, representative, and prevalent among …
Understanding and exploiting spatial properties of system failures on extreme-scale hpc systems
S Gupta, D Tiwari, C Jantzi, J Rogers… - 2015 45th Annual …, 2015 - ieeexplore.ieee.org
As we approach exascale, the scientific simulations are expected to experience more
interruptions due to increased system failures. Designing better HPC resilience techniques …
interruptions due to increased system failures. Designing better HPC resilience techniques …
Codestitcher: inter-procedural basic block layout optimization
Modern software executes a large amount of code. Previous techniques of code layout
optimization were developed one or two decades ago and have become inadequate to cope …
optimization were developed one or two decades ago and have become inadequate to cope …
A relational theory of locality
In many areas of program and system analysis and optimization, locality is a common
concept and has been defined and measured in many ways. This article aims to formally …
concept and has been defined and measured in many ways. This article aims to formally …
Thread data sharing in cache: Theory and measurement
On modern multi-core processors, independent workloads often interfere with each other by
competing for shared cache space. However, for multi-threaded workloads, where a single …
competing for shared cache space. However, for multi-threaded workloads, where a single …
Spatial locality-aware cache partitioning for effective cache sharing
In modern multi-core processors, last-level caches (LLCs) are typically shared among
multiple cores. Previous works have shown that such sharing is beneficial as different …
multiple cores. Previous works have shown that such sharing is beneficial as different …
Performance metrics and models for shared cache
Performance metrics and models are prerequisites for scientific understanding and
optimization. This paper introduces a new footprint-based theory and reviews the research …
optimization. This paper introduces a new footprint-based theory and reviews the research …
Analytical miss rate calculation of L2 cache from the RD profile of L1 cache
JM Sabarimuthu, TG Venkatesh - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
Reuse distance is an important metric for analytical estimation of cache miss rate. To find the
miss rate of a particular cache, the reuse distance profile has to be measured for that …
miss rate of a particular cache, the reuse distance profile has to be measured for that …
LPM: concurrency-driven layered performance matching
YH Liu, XH Sun - 2015 44th International Conference on …, 2015 - ieeexplore.ieee.org
Data access has become the preeminent performance bottleneck of computing. In this study,
a Layered Performance Matching (LPM) model and its associated algorithm are proposed to …
a Layered Performance Matching (LPM) model and its associated algorithm are proposed to …