Bingo spatial data prefetcher

Pythia: A customizable hardware prefetching framework using online reinforcement learning

R Bera, K Kanellopoulos, A Nori, T Shahroodi… - MICRO-54: 54th Annual …, 2021 - dl.acm.org

Past research has proposed numerous hardware prefetching techniques, most of which rely
on exploiting one specific type of program context information (e.g., program counter …

Cited by 73 Related articles All 7 versions

A hierarchical neural model of data prefetching

Z Shi, A Jain, K Swersky, M Hashemi… - Proceedings of the 26th …, 2021 - dl.acm.org

This paper presents Voyager, a novel neural network for data prefetching. Unlike previous
neural models for prefetching, which are limited to learning delta correlations, our model can …

Cited by 81 Related articles All 7 versions

The championship simulator: Architectural simulation for education and competition

N Gober, G Chacon, L Wang, PV Gratz… - arXiv preprint arXiv …, 2022 - arxiv.org

Recent years have seen a dramatic increase in the microarchitectural complexity of
processors. This increase in complexity presents a twofold challenge for the field of …

Cited by 43 Related articles All 2 versions

Prodigy: Improving the memory latency of data-indirect irregular workloads using hardware-software co-design

N Talati, K May, A Behroozi, Y Yang… - … Symposium on High …, 2021 - ieeexplore.ieee.org

Irregular workloads are typically bottlenecked by the memory system. These workloads often
use sparse data representations, e.g., compressed sparse row/column (CSR/CSC), to …

Cited by 62 Related articles All 9 versions

Evaluation of hardware data prefetchers on server processors

M Bakhshalipour, S Tabaeiaghdaei… - ACM Computing …, 2019 - dl.acm.org

Data prefetching, i.e., the act of predicting an application's future memory accesses and
fetching those that are not in the on-chip caches, is a well-known and widely used approach …

Cited by 34 Related articles

Decoupled vector runahead

A Naithani, J Roelandts, S Ainsworth… - Proceedings of the 56th …, 2023 - dl.acm.org

We present Decoupled Vector Runahead (DVR), an in-core prefetching technique,
executing separately to the main application thread, that exploits massive amounts of …

Cited by 8 Related articles All 9 versions

A survey on PCM lifetime enhancement schemes

S Rashidi, M Jalili, H Sarbazi-Azad - ACM Computing Surveys (CSUR), 2019 - dl.acm.org

Phase Change Memory (PCM) is an emerging memory technology that has the capability to
address the growing demand for memory capacity and bridge the gap between the main …

Cited by 21 Related articles All 2 versions

Hermes: Accelerating long-latency load requests via perceptron-based off-chip load prediction

R Bera, K Kanellopoulos… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org

Long-latency load requests continue to limit the performance of modern high-performance
processors. To increase the latency tolerance of a processor, architects have primarily relied …

Cited by 21 Related articles All 7 versions

APT-GET: profile-guided timely software prefetching

S Jamilan, TA Khan, G Ayers, B Kasikci… - Proceedings of the …, 2022 - dl.acm.org

Prefetching, which predicts future memory accesses and preloads them from main memory,
is a widely-adopted technique to overcome the processor-memory performance gap …

Cited by 27 Related articles All 7 versions

Spaghetti: Streaming accelerators for highly sparse GEMM on FPGAs

R Hojabr, A Sedaghati, A Sharifian… - … Symposium on High …, 2021 - ieeexplore.ieee.org

Generalized Sparse Matrix-Matrix Multiplication (Sparse GEMM) is widely used across
multiple domains, but the computation's regularity is dependent on the input sparsity pattern …

Cited by 42 Related articles All 3 versions

Pythia: A customizable hardware prefetching framework using online reinforcement learning