Predicting inter-thread cache contention on a chip multi-processor architecture

D Chandra, F Guo, S Kim… - … Symposium on High …, 2005 - ieeexplore.ieee.org
This paper studies the impact of L2 cache sharing on threads that simultaneously share the
cache, on a chip multi-processor (CMP) architecture. Cache sharing impacts threads …

The ZCache: Decoupling ways and associativity

D Sanchez, C Kozyrakis - 2010 43rd Annual IEEE/ACM …, 2010 - ieeexplore.ieee.org
The ever-increasing importance of main memory latency and bandwidth is pushing CMPs
towards caches with higher capacity and associativity. Associativity is typically improved by …

The V-Way cache: demand-based associativity via global replacement

MK Qureshi, D Thompson… - … Symposium on Computer …, 2005 - ieeexplore.ieee.org
As processor speeds increase and memory latency becomes more critical, intelligent design
and management of secondary caches becomes increasingly important. The efficiency of …

Talus: A simple way to remove cliffs in cache performance

N Beckmann, D Sanchez - 2015 IEEE 21st International …, 2015 - ieeexplore.ieee.org
Caches often suffer from performance cliffs: minor changes in program behavior or available
cache space cause large changes in miss rate. Cliffs hurt performance and complicate …

The bunker cache for spatio-value approximation

J San Miguel, J Albericio, NE Jerger… - 2016 49th Annual IEEE …, 2016 - ieeexplore.ieee.org
The cost of moving and storing data is still a fundamental concern for computer architects.
Inefficient handling of data can be attributed to conventional architectures being oblivious to …

Futility scaling: High-associativity cache partitioning

R Wang, L Chen - 2014 47th Annual IEEE/ACM International …, 2014 - ieeexplore.ieee.org
As shared last level caches are widely used in many-core CMPs to boost system
performance, partitioning a large shared cache among multiple concurrently running …

Modeling cache performance beyond LRU

N Beckmann, D Sanchez - 2016 IEEE International Symposium …, 2016 - ieeexplore.ieee.org
Modern processors use high-performance cache replacement policies that outperform
traditional alternatives like least-recently used (LRU). Unfortunately, current cache models …

TLB tag parity checking without CAM read

MA Luttrell, PJ Jordan - US Patent 7,366,829, 2008 - Google Patents
… access operations is described in connection with a multi-threaded multiprocessor chip.
This parity …

Adaptive line placement with the set balancing cache

D Rolán, BB Fraguela, R Doallo - Proceedings of the 42nd Annual IEEE …, 2009 - dl.acm.org
Efficient memory hierarchy design is critical due to the increasing gap between the speed of
the processors and the memory. One of the sources of inefficiency in current caches is the …

XOR-based hash functions

H Vandierendonck… - IEEE Transactions on …, 2005 - ieeexplore.ieee.org
Bank conflicts can severely reduce the bandwidth of an interleaved multibank memory and
conflict misses increase the miss rate of a cache or a predictor. Both occurrences are …