DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks

GF Oliveira, J Gómez-Luna, L Orosa, S Ghose… - IEEE …, 2021 - ieeexplore.ieee.org
Data movement between the CPU and main memory is a first-order obstacle against
improving performance, scalability, and energy efficiency in modern systems. Computer systems …

Pythia: A customizable hardware prefetching framework using online reinforcement learning

R Bera, K Kanellopoulos, A Nori, T Shahroodi… - MICRO-54: 54th Annual …, 2021 - dl.acm.org
Past research has proposed numerous hardware prefetching techniques, most of which rely
on exploiting one specific type of program context information (e.g., program counter …

Accelerating pointer chasing in 3D-stacked memory: Challenges, mechanisms, evaluation

K Hsieh, S Khan, N Vijaykumar… - 2016 IEEE 34th …, 2016 - ieeexplore.ieee.org
Pointer chasing is a fundamental operation, used by many important data-intensive
applications (e.g., databases, key-value stores, graph processing workloads) to traverse …

EDEN: Enabling energy-efficient, high-performance deep neural network inference using approximate DRAM

S Koppula, L Orosa, AG Yağlıkçı, R Azizi… - Proceedings of the …, 2019 - dl.acm.org
The effectiveness of deep neural networks (DNN) in vision, speech, and language
processing has prompted a tremendous demand for energy-efficient high-performance DNN …

Understanding reduced-voltage operation in modern DRAM devices: Experimental characterization, analysis, and mechanisms

KK Chang, AG Yağlıkçı, S Ghose, A Agrawal… - Proceedings of the …, 2017 - dl.acm.org
The energy consumption of DRAM is a critical concern in modern computing systems.
Improvements in manufacturing process technology have allowed DRAM vendors to lower …

NATSA: a near-data processing accelerator for time series analysis

I Fernandez, R Quislant, E Gutiérrez… - 2020 IEEE 38th …, 2020 - ieeexplore.ieee.org
Time series analysis is a key technique for extracting and predicting events in domains as
diverse as epidemiology, genomics, neuroscience, environmental sciences, economics, and …

Feedback directed prefetching: Improving the performance and bandwidth-efficiency of hardware prefetchers

S Srinath, O Mutlu, H Kim… - 2007 IEEE 13th …, 2007 - ieeexplore.ieee.org
High performance processors employ hardware data prefetching to reduce the negative
performance impact of large main memory latencies. While prefetching improves …

Research problems and opportunities in memory systems

O Mutlu, L Subramanian - Supercomputing frontiers and …, 2014 - superfri.susu.ru
The memory system is a fundamental performance and energy bottleneck in almost all
computing systems. Recent system design, application, and technology trends that require …

ChargeCache: Reducing DRAM latency by exploiting row access locality

H Hassan, G Pekhimenko, N Vijaykumar… - … Symposium on High …, 2016 - ieeexplore.ieee.org
DRAM latency continues to be a critical bottleneck for system performance. In this work, we
develop a low-cost mechanism, called ChargeCache, that enables faster access to recently …

Evaluation of hardware data prefetchers on server processors

M Bakhshalipour, S Tabaeiaghdaei… - ACM Computing …, 2019 - dl.acm.org
Data prefetching, i.e., the act of predicting an application's future memory accesses and
fetching those that are not in the on-chip caches, is a well-known and widely used approach …