High performance RDMA-based MPI implementation over InfiniBand
J Liu, J Wu, SP Kini, P Wyckoff, DK Panda - Proceedings of the 17th …, 2003 - dl.acm.org
Although InfiniBand Architecture is relatively new in the high performance computing area, it
offers many features which help us to improve the performance of communication …
Load value approximation
Approximate computing explores opportunities that emerge when applications can tolerate
error or inexactness. These applications, which range from multimedia processing to …
Practical data value speculation for future high-end processors
Dedicating more silicon area to single thread performance will necessarily be considered as
worthwhile in future, potentially heterogeneous, multicores. In particular, Value prediction …
Exploiting value locality in physical register files
S Balakrishnan, GS Sohi - Proceedings. 36th Annual IEEE …, 2003 - ieeexplore.ieee.org
The physical register file is an important component of a dynamically-scheduled processor.
Increasing the amount of parallelism places increasing demands on the physical register …
A survey of value prediction techniques for leveraging value locality
S Mittal - Concurrency and computation: practice and …, 2017 - Wiley Online Library
Value locality (VL) refers to recurrence of values in a memory structure, and value prediction
(VP) refers to predicting VL and leveraging it for diverse optimizations. VP holds the promise …
EOLE: Paving the way for an effective implementation of value prediction
Even in the multicore era, there is a continuous demand to increase the performance of
single-threaded applications. However, the conventional path of increasing both issue width …
Prefetch injection based on hardware monitoring and object metadata
AR Adl-Tabatabai, RL Hudson, MJ Serrano… - ACM SIGPLAN …, 2004 - dl.acm.org
Cache miss stalls hurt performance because of the large gap between memory and
processor speeds; for example, the popular server benchmark SPEC JBB2000 spends 45 …
Enhancing memory level parallelism via recovery-free value prediction
The ever-increasing computational power of contemporary microprocessors reduces the
execution time spent on arithmetic computations (i.e., the computations not involving slow …
Leveraging targeted value prediction to unlock new hardware strength reduction potential
A Perais - MICRO-54: 54th Annual IEEE/ACM International …, 2021 - dl.acm.org
Value Prediction (VP) is a microarchitectural technique that speculatively breaks data
dependencies to increase the available Instruction Level Parallelism (ILP) in general …
[BOOK][B] Speculative execution in high performance computer architectures
Until now, there were few textbooks that focused on the dynamic subject of speculative
execution, a topic that is crucial to the development of high performance computer …