Bingo spatial data prefetcher
M Bakhshalipour, M Shakerinava… - … Symposium on High …, 2019 - ieeexplore.ieee.org
Applications extensively use data objects with a regular and fixed layout, which leads to the
recurrence of access patterns over memory regions. Spatial data prefetching techniques …
Unison cache: A scalable and effective die-stacked DRAM cache
Recent research advocates large die-stacked DRAM caches in many-core servers to break
the memory latency and bandwidth wall. To realize their full potential, die-stacked DRAM …
The Mondrian data engine
The increasing demand for extracting value out of ever-growing data poses an ongoing
challenge to system designers, a task only made trickier by the end of Dennard scaling. As …
MemPod: A clustered architecture for efficient and scalable migration in flat address space multi-level memories
A Prodromou, M Meswani, N Jayasena… - … Symposium on High …, 2017 - ieeexplore.ieee.org
In the near future, die-stacked DRAM will be increasingly present in conjunction with off-chip
memories in hybrid memory systems. Research on this subject revolves around using the …
PageSeer: Using page walks to trigger page swaps in hybrid memory systems
A Kokolis, D Skarlatos… - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
Hybrid main memories composed of DRAM and NonVolatile Memory (NVM) combine the
capacity benefits of NVM with the low-latency properties of DRAM. For highest performance …
A performance & power comparison of modern high-speed DRAM architectures
To feed the high degrees of parallelism in modern graphics processors and manycore CPU
designs, DRAM manufacturers have created new DRAM architectures that deliver high …
A case for richer cross-layer abstractions: Bridging the semantic gap with expressive memory
N Vijaykumar, A Jain, D Majumdar… - 2018 ACM/IEEE 45th …, 2018 - ieeexplore.ieee.org
This paper makes a case for a new cross-layer interface, Expressive Memory (XMem), to
communicate higher-level program semantics from the application to the system software …
Predicting the memory bandwidth and optimal core allocations for multi-threaded applications on large-scale NUMA machines
Modern NUMA platforms offer large numbers of cores to boost performance through
parallelism and multi-threading. However, because performance scalability is limited by …
Efficient footprint caching for tagless DRAM caches
Efficient cache tag management is a primary design objective for large, in-package DRAM
caches. Recently, Tagless DRAM Caches (TDCs) have been proposed to completely …
Sort vs. hash join revisited for near-memory execution
Data movement between memory and CPU is a well-known energy bottleneck for analytics.
Near-Memory Processing (NMP) is a promising approach for eliminating this bottleneck by …