Bouquet of instruction pointers: Instruction pointer classifier-based spatial hardware prefetching

S Pakalapati, B Panda - 2020 ACM/IEEE 47th Annual …, 2020 - ieeexplore.ieee.org
Hardware prefetching is one of the common off-chip DRAM latency hiding techniques.
Though hardware prefetchers are ubiquitous in the commercial machines and prefetching …

CLIP: Load criticality based data prefetching for bandwidth-constrained many-core systems

B Panda - Proceedings of the 56th Annual IEEE/ACM …, 2023 - dl.acm.org
Hardware prefetching is a latency-hiding technique that hides the costly off-chip DRAM
accesses. However, state-of-the-art prefetchers fail to deliver performance improvement in …

Combining prefetch control and cache partitioning to improve multicore performance

G Sun, J Shen, AV Veidenbaum - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
Modern commercial multi-core processors are equipped with multiple hardware prefetchers
on each core. The prefetchers can significantly improve application performance. However …

Machine learning for fine-grained hardware prefetcher control

J Hiebel, LE Brown, Z Wang - … of the 48th International Conference on …, 2019 - dl.acm.org
Modern architectures provide hardware memory prefetching capabilities which can be
configured at runtime. While hardware prefetching can provide substantial performance …

SPAC: A synergistic prefetcher aggressiveness controller for multi-core systems

B Panda - IEEE Transactions on Computers, 2016 - ieeexplore.ieee.org
In multi-core systems, prefetch requests of one core interfere with the demand and prefetch
requests of other cores at the shared resources, which causes prefetcher-caused …

Intelligent adaptation of hardware knobs for improving performance and power consumption

C Ortega, L Alvarez, M Casas, R Bertran… - IEEE Transactions …, 2020 - ieeexplore.ieee.org
Current microprocessors include several knobs to modify the hardware behavior in order to
improve performance, power, and energy under different workload demands. An impractical …

Bandwidth-aware dynamic prefetch configuration for IBM POWER8

C Navarro, J Feliu, S Petit, ME Gomez… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Advanced hardware prefetch engines are being integrated in current high-performance
processors. Prefetching can boost the performance of most applications; however, the …

Band-pass prefetching: an effective prefetch management mechanism using prefetch-fraction metric in multi-core systems

A Sridharan, B Panda, A Seznec - ACM Transactions on Architecture and …, 2017 - dl.acm.org
In multi-core systems, an application's prefetcher can interfere with the memory requests of
other applications using the shared resources, such as last level cache and memory …

COPE: Reducing Cache Pollution and Network Contention by Inter-tile Coordinated Prefetching in NoC-based MPSoCs

D Deb, J Jose, M Palesi - ACM Transactions on Design Automation of …, 2020 - dl.acm.org
Prefetching helps in reducing the memory access latency in multi-banked NUCA
architecture, where the Last Level Cache (LLC) is shared. In such systems, an application …

DeepP: deep learning multi-program prefetch configuration for the IBM POWER 8

M Lurbe, J Feliu, S Petit, ME Gómez… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Current multi-core processors implement sophisticated hardware prefetchers that can be
configured by application (PID), to improve the system performance. When running multiple …