Data prefetch mechanisms
SP Vanderwiel, DJ Lilja - ACM Computing Surveys (CSUR), 2000 - dl.acm.org
The expanding gap between microprocessor and DRAM performance has necessitated the
use of increasingly aggressive techniques designed to reduce or hide the latency of main …
Canvas: Isolated and adaptive swapping for Multi-Applications on remote memory
Remote memory techniques for datacenter applications have recently gained a great deal of
popularity. Existing remote memory techniques focus on the efficiency of a single application …
[BOOK][B] Data access and storage management for embedded programmable processors
F Catthoor, K Danckaert - 2002 - books.google.com
Data Access and Storage Management for Embedded Programmable Processors gives an
overview of the state-of-the-art in system-level data access and storage management for …
Improving Docker registry design based on production workload analysis
Containers offer an efficient way to run workloads as independent microservices that can be
developed, tested and deployed in an agile manner. To facilitate this process, container …
Fine-grained address segmentation for attention-based variable-degree prefetching
Machine learning algorithms have shown potential to improve prefetching performance by
accurately predicting future memory accesses. Existing approaches are based on the …
Scope-aware data cache analysis for WCET estimation
BK Huynh, L Ju, A Roychoudhury - 2011 17th IEEE Real-Time …, 2011 - ieeexplore.ieee.org
Caches are widely used in modern computer systems to bridge the increasing gap between
processor speed and memory access time. On the other hand, presence of caches …
Snake: A variable-length chain-based prefetching for GPUs
Graphics Processing Units (GPUs) utilize memory hierarchy and Thread-Level Parallelism
(TLP) to tolerate off-chip memory latency, which is a significant bottleneck for memory-bound …
Resemble: reinforced ensemble framework for data prefetching
Data prefetching hides memory latency by predicting and loading necessary data into cache
beforehand. Most prefetchers in the literature are efficient for specific memory address …
Raop: Recurrent neural network augmented offset prefetcher
The rapid development of Big Data coupled with slowing down of Moore's law has made the
memory performance a bottleneck in the von Neumann architecture. Machine learning has …
Sharp: Software hint-assisted memory access prediction for graph analytics
Memory system performance is a major bottleneck in large-scale graph analytics. Data
prefetching can hide memory latency; this relies on accurate prediction of memory accesses …