A survey of architectural techniques for improving cache power efficiency

S Mittal - Sustainable Computing: Informatics and Systems, 2014 - Elsevier
Modern processors are using increasingly larger sized on-chip caches. Also, with each
CMOS technology generation, there has been a significant increase in their leakage energy …

Practical near-data processing for in-memory analytics frameworks

M Gao, G Ayers, C Kozyrakis - 2015 International Conference …, 2015 - ieeexplore.ieee.org
The end of Dennard scaling has made all systems energy-constrained. For data-intensive
applications with limited temporal locality, the major energy bottleneck is data …

TOP-PIM: Throughput-oriented programmable processing in memory

D Zhang, N Jayasena, A Lyashevsky… - Proceedings of the 23rd …, 2014 - dl.acm.org
As computation becomes increasingly limited by data movement and energy consumption,
exploiting locality throughout the memory hierarchy becomes critical to continued …

HRL: Efficient and flexible reconfigurable logic for near-data processing

M Gao, C Kozyrakis - 2016 IEEE International Symposium on …, 2016 - ieeexplore.ieee.org
The energy constraints due to the end of Dennard scaling, the popularity of in-memory
analytics, and the advances in 3D integration technology have led to renewed interest in …

Heterogeneous memory architectures: A HW/SW approach for mixing die-stacked and off-package memories

MR Meswani, S Blagodurov, D Roberts… - 2015 IEEE 21st …, 2015 - ieeexplore.ieee.org
Die-stacked DRAM is a technology that will soon be integrated in high-performance
systems. Recent studies have focused on hardware caching techniques to make use of the …

Bingo spatial data prefetcher

M Bakhshalipour, M Shakerinava… - … Symposium on High …, 2019 - ieeexplore.ieee.org
Applications extensively use data objects with a regular and fixed layout, which leads to the
recurrence of access patterns over memory regions. Spatial data prefetching techniques …

Unison cache: A scalable and effective die-stacked DRAM cache

D Jevdjic, GH Loh, C Kaynak… - 2014 47th Annual IEEE …, 2014 - ieeexplore.ieee.org
Recent research advocates large die-stacked DRAM caches in many-core servers to break
the memory latency and bandwidth wall. To realize their full potential, die-stacked DRAM …

Cameo: A two-level memory organization with capacity of main memory and flexibility of hardware-managed cache

CC Chou, A Jaleel, MK Qureshi - 2014 47th Annual IEEE/ACM …, 2014 - ieeexplore.ieee.org
This paper analyzes the trade-offs in architecting stacked DRAM either as part of main
memory or as a hardware-managed cache. Using stacked DRAM as part of main memory …

A survey of techniques for architecting DRAM caches

S Mittal, JS Vetter - IEEE Transactions on Parallel and …, 2015 - ieeexplore.ieee.org
Recent trends of increasing core-count and memory/bandwidth-wall have led to major
overhauls in chip architecture. In face of increasing cache capacity demands, researchers …

Domino temporal data prefetcher

M Bakhshalipour, P Lotfi-Kamran… - … Symposium on High …, 2018 - ieeexplore.ieee.org
Big-data server applications frequently encounter data misses, and hence, lose significant
performance potential. One way to reduce the number of data misses or their effect is data …