A survey of recent prefetching techniques for processor caches
S Mittal - ACM Computing Surveys (CSUR), 2016 - dl.acm.org
As the trends of process scaling make memory systems an even more crucial bottleneck, the
importance of latency hiding techniques such as prefetching grows further. However, naively …
Minimalist open-page: A DRAM page-mode scheduling policy for the many-core era
Contemporary DRAM systems have maintained impressive scaling by managing a careful
balance between performance, power, and storage density. In achieving these goals, a …
When prefetching works, when it doesn't, and why
In emerging and future high-end processor systems, tolerating increasing cache miss
latency and properly managing memory bandwidth will be critical to achieving high …
Access map pattern matching for high performance data cache prefetch
Y Ishii, M Inaba, K Hiraki - Journal of Instruction-Level Parallelism, 2011 - jilp.org
Hardware data prefetching is widely adopted to hide long memory latency. A hardware data
prefetcher predicts the memory address that will be accessed in the near future and fetches …
A review on shared resource contention in multicores and its mitigating techniques
Chip multiprocessor (CMP) systems have become inevitable to meet high computing
demands. In such systems sharing of resources is imperative for better resource utilisation …
Limoncello: Prefetchers for Scale
This paper presents Limoncello, a novel software system that dynamically configures data
prefetchers for high-utilization systems. We demonstrate that in resource-constrained …
Multi-level hardware prefetching using low complexity delta correlating prediction tables with partial matching
This paper presents a low complexity table-based approach to delta correlation prefetching.
Our approach uses a table indexed by the load address which stores the latest deltas …
Semiconductor device including a global buffer shared by a plurality of memory controllers
K Park, WOO Su-Hae, S Kang - US Patent 10,157,152, 2018 - Google Patents
A semiconductor device includes a plurality of memory controllers each of which includes a
local buffer, a global buffer coupled to the plurality of memory controllers and including …
Band-pass prefetching: an effective prefetch management mechanism using prefetch-fraction metric in multi-core systems
In multi-core systems, an application's prefetcher can interfere with the memory requests of
other applications using the shared resources, such as last level cache and memory …
Dynamically adjusting read data return sizes based on memory interface bus utilization
JS Dodson, SJ Powell, EE Retter… - US Patent 9,684,461, 2017 - Google Patents
A memory system comprises memory devices coupled to a memory controller via a memory
interface bus, the memory controller for receiving one or more memory requests via an …