DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks
Data movement between the CPU and main memory is a first-order obstacle against improving
performance, scalability, and energy efficiency in modern systems. Computer systems …
Pythia: A customizable hardware prefetching framework using online reinforcement learning
Past research has proposed numerous hardware prefetching techniques, most of which rely
on exploiting one specific type of program context information (e.g., program counter …
Accelerating pointer chasing in 3D-stacked memory: Challenges, mechanisms, evaluation
Pointer chasing is a fundamental operation, used by many important data-intensive
applications (e.g., databases, key-value stores, graph processing workloads) to traverse …
EDEN: Enabling energy-efficient, high-performance deep neural network inference using approximate DRAM
The effectiveness of deep neural networks (DNN) in vision, speech, and language
processing has prompted a tremendous demand for energy-efficient high-performance DNN …
Understanding reduced-voltage operation in modern DRAM devices: Experimental characterization, analysis, and mechanisms
The energy consumption of DRAM is a critical concern in modern computing systems.
Improvements in manufacturing process technology have allowed DRAM vendors to lower …
NATSA: a near-data processing accelerator for time series analysis
Time series analysis is a key technique for extracting and predicting events in domains as
diverse as epidemiology, genomics, neuroscience, environmental sciences, economics, and …
Feedback directed prefetching: Improving the performance and bandwidth-efficiency of hardware prefetchers
High performance processors employ hardware data prefetching to reduce the negative
performance impact of large main memory latencies. While prefetching improves …
Research problems and opportunities in memory systems
O Mutlu, L Subramanian - Supercomputing frontiers and …, 2014 - superfri.susu.ru
The memory system is a fundamental performance and energy bottleneck in almost all
computing systems. Recent system design, application, and technology trends that require …
ChargeCache: Reducing DRAM latency by exploiting row access locality
DRAM latency continues to be a critical bottleneck for system performance. In this work, we
develop a low-cost mechanism, called ChargeCache, that enables faster access to recently …
Evaluation of hardware data prefetchers on server processors
M Bakhshalipour, S Tabaeiaghdaei… - ACM Computing …, 2019 - dl.acm.org
Data prefetching, i.e., the act of predicting an application's future memory accesses and
fetching those that are not in the on-chip caches, is a well-known and widely used approach …