Predicting inter-thread cache contention on a chip multi-processor architecture
This paper studies the impact of L2 cache sharing on threads that simultaneously share the
cache, on a chip multi-processor (CMP) architecture. Cache sharing impacts threads …
The ZCache: Decoupling ways and associativity
D Sanchez, C Kozyrakis - 2010 43rd Annual IEEE/ACM …, 2010 - ieeexplore.ieee.org
The ever-increasing importance of main memory latency and bandwidth is pushing CMPs
towards caches with higher capacity and associativity. Associativity is typically improved by …
The V-Way cache: demand-based associativity via global replacement
MK Qureshi, D Thompson… - … Symposium on Computer …, 2005 - ieeexplore.ieee.org
As processor speeds increase and memory latency becomes more critical, intelligent design
and management of secondary caches becomes increasingly important. The efficiency of …
Talus: A simple way to remove cliffs in cache performance
N Beckmann, D Sanchez - 2015 IEEE 21st International …, 2015 - ieeexplore.ieee.org
Caches often suffer from performance cliffs: minor changes in program behavior or available
cache space cause large changes in miss rate. Cliffs hurt performance and complicate …
The bunker cache for spatio-value approximation
The cost of moving and storing data is still a fundamental concern for computer architects.
Inefficient handling of data can be attributed to conventional architectures being oblivious to …
Futility scaling: High-associativity cache partitioning
As shared last level caches are widely used in many-core CMPs to boost system
performance, partitioning a large shared cache among multiple concurrently running …
Modeling cache performance beyond LRU
N Beckmann, D Sanchez - 2016 IEEE International Symposium …, 2016 - ieeexplore.ieee.org
Modern processors use high-performance cache replacement policies that outperform
traditional alternatives like least-recently used (LRU). Unfortunately, current cache models …
traditional alternatives like least-recently used (LRU). Unfortunately, current cache models …
TLB tag parity checking without CAM read
MA Luttrell, PJ Jordan - US Patent 7,366,829, 2008 - Google Patents
… access operations is described in connection with a multithreaded multiprocessor chip. This parity …
Adaptive line placement with the set balancing cache
Efficient memory hierarchy design is critical due to the increasing gap between the speed of
the processors and the memory. One of the sources of inefficiency in current caches is the …
XOR-based hash functions
H Vandierendonck… - IEEE Transactions on …, 2005 - ieeexplore.ieee.org
Bank conflicts can severely reduce the bandwidth of an interleaved multibank memory and
conflict misses increase the miss rate of a cache or a predictor. Both occurrences are …
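The idea behind the last entry can be sketched with a minimal XOR-folding index function. This is an illustrative example, not the paper's exact construction: it assumes a cache with 64 sets (6 index bits) and 64-byte lines (6 offset bits), and folds the remaining address bits together with XOR so that strided addresses that collide under plain modulo indexing spread across sets.

```python
INDEX_BITS = 6    # assumed: 64 sets
OFFSET_BITS = 6   # assumed: 64-byte cache lines

def xor_set_index(addr: int) -> int:
    """Fold the tag+index bits of addr into a set index by repeated XOR."""
    bits = addr >> OFFSET_BITS              # drop the line-offset bits
    index = 0
    while bits:
        index ^= bits & ((1 << INDEX_BITS) - 1)   # take the next 6-bit chunk
        bits >>= INDEX_BITS
    return index

# Addresses strided by one "way" of a modulo-indexed cache (64 sets * 64 B)
# would all map to set 0 under (addr >> 6) % 64; XOR folding separates them.
stride = 64 * 64
sets = {xor_set_index(base) for base in range(0, 8 * stride, stride)}
print(sorted(sets))   # eight distinct sets instead of one
```

The design choice illustrated here is that XOR mixing is essentially free in hardware (one level of XOR gates per index bit) yet breaks the power-of-two stride patterns that cause conflict misses and bank conflicts.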