Bouquet of instruction pointers: Instruction pointer classifier-based spatial hardware prefetching
S Pakalapati, B Panda - 2020 ACM/IEEE 47th Annual …, 2020 - ieeexplore.ieee.org
Hardware prefetching is one of the common off-chip DRAM latency hiding techniques.
Though hardware prefetchers are ubiquitous in the commercial machines and prefetching …
Clip: Load criticality based data prefetching for bandwidth-constrained many-core systems
B Panda - Proceedings of the 56th Annual IEEE/ACM …, 2023 - dl.acm.org
Hardware prefetching is a latency-hiding technique that hides the costly off-chip DRAM
accesses. However, state-of-the-art prefetchers fail to deliver performance improvement in …
Combining prefetch control and cache partitioning to improve multicore performance
Modern commercial multi-core processors are equipped with multiple hardware prefetchers
on each core. The prefetchers can significantly improve application performance. However …
Machine learning for fine-grained hardware prefetcher control
Modern architectures provide hardware memory prefetching capabilities which can be
configured at runtime. While hardware prefetching can provide substantial performance …
SPAC: A synergistic prefetcher aggressiveness controller for multi-core systems
B Panda - IEEE Transactions on Computers, 2016 - ieeexplore.ieee.org
In multi-core systems, prefetch requests of one core interfere with the demand and prefetch
requests of other cores at the shared resources, which causes prefetcher-caused …
Intelligent adaptation of hardware knobs for improving performance and power consumption
Current microprocessors include several knobs to modify the hardware behavior in order to
improve performance, power, and energy under different workload demands. An impractical …
Bandwidth-aware dynamic prefetch configuration for IBM POWER8
Advanced hardware prefetch engines are being integrated in current high-performance
processors. Prefetching can boost the performance of most applications, however, the …
Band-pass prefetching: an effective prefetch management mechanism using prefetch-fraction metric in multi-core systems
In multi-core systems, an application's prefetcher can interfere with the memory requests of
other applications using the shared resources, such as last level cache and memory …
COPE: Reducing Cache Pollution and Network Contention by Inter-tile Coordinated Prefetching in NoC-based MPSoCs
Prefetching helps in reducing the memory access latency in multi-banked NUCA
architecture, where the Last Level Cache (LLC) is shared. In such systems, an application …
DeepP: deep learning multi-program prefetch configuration for the IBM POWER 8
Current multi-core processors implement sophisticated hardware prefetchers, that can be
configured by application (PID), to improve the system performance. When running multiple …