Bingo spatial data prefetcher

M Bakhshalipour, M Shakerinava… - … Symposium on High …, 2019 - ieeexplore.ieee.org
Applications extensively use data objects with a regular and fixed layout, which leads to the
recurrence of access patterns over memory regions. Spatial data prefetching techniques …

Unison Cache: A scalable and effective die-stacked DRAM cache

D Jevdjic, GH Loh, C Kaynak… - 2014 47th Annual IEEE …, 2014 - ieeexplore.ieee.org
Recent research advocates large die-stacked DRAM caches in manycore servers to break
the memory latency and bandwidth wall. To realize their full potential, die-stacked DRAM …

The Mondrian Data Engine

M Drumond, A Daglis, N Mirzadeh, D Ustiugov… - ACM SIGARCH …, 2017 - dl.acm.org
The increasing demand for extracting value out of ever-growing data poses an ongoing
challenge to system designers, a task only made trickier by the end of Dennard scaling. As …

MemPod: A clustered architecture for efficient and scalable migration in flat address space multi-level memories

A Prodromou, M Meswani, N Jayasena… - … Symposium on High …, 2017 - ieeexplore.ieee.org
In the near future, die-stacked DRAM will be increasingly present in conjunction with off-chip
memories in hybrid memory systems. Research on this subject revolves around using the …

PageSeer: Using page walks to trigger page swaps in hybrid memory systems

A Kokolis, D Skarlatos… - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
Hybrid main memories composed of DRAM and Non-Volatile Memory (NVM) combine the
capacity benefits of NVM with the low-latency properties of DRAM. For highest performance …

A performance & power comparison of modern high-speed DRAM architectures

S Li, D Reddy, B Jacob - Proceedings of the International Symposium on …, 2018 - dl.acm.org
To feed the high degrees of parallelism in modern graphics processors and manycore CPU
designs, DRAM manufacturers have created new DRAM architectures that deliver high …

A case for richer cross-layer abstractions: Bridging the semantic gap with expressive memory

N Vijaykumar, A Jain, D Majumdar… - 2018 ACM/IEEE 45th …, 2018 - ieeexplore.ieee.org
This paper makes a case for a new cross-layer interface, Expressive Memory (XMem), to
communicate higher-level program semantics from the application to the system software …

Predicting the memory bandwidth and optimal core allocations for multi-threaded applications on large-scale NUMA machines

W Wang, JW Davidson, ML Soffa - 2016 IEEE International …, 2016 - ieeexplore.ieee.org
Modern NUMA platforms offer large numbers of cores to boost performance through
parallelism and multi-threading. However, because performance scalability is limited by …

Efficient footprint caching for tagless DRAM caches

H Jang, Y Lee, J Kim, Y Kim, J Kim… - … Symposium on High …, 2016 - ieeexplore.ieee.org
Efficient cache tag management is a primary design objective for large, in-package DRAM
caches. Recently, Tagless DRAM Caches (TDCs) have been proposed to completely …

Sort vs. hash join revisited for near-memory execution

NS Mirzadeh, O Kocberber, B Falsafi… - Fifth Workshop on …, 2015 - research.ed.ac.uk
Data movement between memory and CPU is a well-known energy bottleneck for analytics.
Near-Memory Processing (NMP) is a promising approach for eliminating this bottleneck by …