Understanding memory access patterns for prefetching

A survey of machine learning for computer architecture and systems

N Wu, Y Xie - ACM Computing Surveys (CSUR), 2022 - dl.acm.org

It has been a long time that computer architecture and systems are optimized for efficient
execution of machine learning (ML) models. Now, it is time to reconsider the relationship …

Cited by 81 · Related articles · All 4 versions

A survey of machine learning applied to computer architecture design

DD Penney, L Chen - arXiv preprint arXiv:1909.12373, 2019 - arxiv.org

Machine learning has enabled significant benefits in diverse fields, but, with a few
exceptions, has had limited impact on computer architecture. Recent work, however, has …

Cited by 41 · Related articles · All 2 versions

Mira: A program-behavior-guided far memory system

Z Guo, Z He, Y Zhang - Proceedings of the 29th Symposium on …, 2023 - dl.acm.org

Far memory, where memory accesses are non-local, has become more popular in recent
years as a solution to expand memory size and avoid memory stranding. Prior far memory …

Cited by 17 · Related articles · All 3 versions

Fine-grained address segmentation for attention-based variable-degree prefetching

P Zhang, A Srivastava, AV Nori, R Kannan… - Proceedings of the 19th …, 2022 - dl.acm.org

Machine learning algorithms have shown potential to improve prefetching performance by
accurately predicting future memory accesses. Existing approaches are based on the …

Cited by 22 · Related articles · All 4 versions

Cache in hand: Expander-driven CXL prefetcher for next generation CXL-SSD

M Kwon, S Lee, M Jung - Proceedings of the 15th ACM Workshop on …, 2023 - dl.acm.org

Integrating compute express link (CXL) with SSDs allows scalable access to large memory
but has slower speeds than DRAMs. We present ExPAND, an expander-driven CXL …

Cited by 14 · Related articles · All 5 versions

Twig: Profile-guided BTB prefetching for data center applications

TA Khan, N Brown, A Sriraman… - MICRO-54: 54th Annual …, 2021 - dl.acm.org

Modern data center applications have deep software stacks, with instruction footprints that
are orders of magnitude larger than typical instruction cache (I-cache) sizes. To efficiently …

Cited by 34 · Related articles · All 7 versions

CRISP: critical slice prefetching

H Litz, G Ayers, P Ranganathan - Proceedings of the 27th ACM …, 2022 - dl.acm.org

The high access latency of DRAM continues to be a performance challenge for
contemporary microprocessor systems. Prefetching is a well-established technique to …

Cited by 21 · Related articles · All 4 versions

Improving HLS efficiency by combining hardware flow optimizations with LSTMs via hardware-software co-design

H Sadasivan, F Lai, H Al Muraf… - Journal of Engineering and …, 2020 - mzjournal.com

The translation of C programs to Verilog can present significant challenges for programmers
aiming to synthesize hardware. To address these challenges, several High-Level Synthesis …

Cited by 29 · Related articles

Improving the accuracy, adaptability, and interpretability of SSD failure prediction models

C Chakraborttii, H Litz - Proceedings of the 11th ACM Symposium on …, 2020 - dl.acm.org

Flash-based solid state drives represent an important storage tier in today's hyperscale data
centers. Although solid state drives (SSDs) are relatively reliable, data center operators are …

Cited by 34 · Related articles · All 4 versions

RAOP: Recurrent neural network augmented offset prefetcher

P Zhang, A Srivastava, B Brooks, R Kannan… - Proceedings of the …, 2020 - dl.acm.org

The rapid development of Big Data coupled with slowing down of Moore's law has made the
memory performance a bottleneck in the von Neumann architecture. Machine learning has …

Cited by 26 · Related articles · All 3 versions