Approximate communication: Techniques for reducing communication bottlenecks in large-scale parallel systems
F Betzel, K Khatamifard, H Suresh, DJ Lilja… - ACM Computing …, 2018 - dl.acm.org
Approximate computing has gained research attention recently as a way to increase energy
efficiency and/or performance by exploiting some applications' intrinsic error resiliency …
DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks
Data movement between the CPU and main memory is a first-order obstacle against improving
performance, scalability, and energy efficiency in modern systems. Computer systems …
[BOOK][B] Computer architecture: a quantitative approach
JL Hennessy, DA Patterson - 2017 - books.google.com
Computer Architecture: A Quantitative Approach, Sixth Edition has been considered
essential reading by instructors, students and practitioners of computer design for over 20 …
Phase tracking and prediction
In a single second a modern processor can execute billions of instructions. Obtaining a
bird's eye view of the behavior of a program at these speeds can be a difficult task when all …
Asyfunc: A high-performance and resource-efficient serverless inference system via asymmetric functions
Recent advances in deep learning (DL) have spawned various intelligent cloud services
with well-trained DL models. Nevertheless, it is nontrivial to maintain the desired end-to-end …
Focusing processor policies via critical-path prediction
B Fields, S Rubin, R Bodik - … of the 28th annual international symposium …, 2001 - dl.acm.org
Although some instructions hurt performance more than others, current processors typically
apply scheduling and speculation as if each instruction was equally costly. Instruction cost …
I-CASH: Intelligently coupled array of SSD and HDD
Q Yang, J Ren - 2011 IEEE 17th International Symposium on …, 2011 - ieeexplore.ieee.org
This paper presents a new disk I/O architecture composed of an array of a flash memory
SSD (solid state disk) and a hard disk drive (HDD) that are intelligently coupled by a special …
In search of speculative thread-level parallelism
JT Oplinger, DL Heine, MS Lam - … International Conference on …, 1999 - ieeexplore.ieee.org
The paper focuses on the problem of how to find and effectively exploit speculative
thread-level parallelism. Our studies show that speculating only on loops does not yield sufficient …
Differential FCM: Increasing value prediction accuracy by improving table usage efficiency
B Goeman, H Vandierendonck… - … Symposium on High …, 2001 - ieeexplore.ieee.org
Value prediction is a relatively new technique to increase the Instruction Level Parallelism
(ILP) in future microprocessors. An important problem when designing a value predictor is …
On the value locality of store instructions
KM Lepak, MH Lipasti - Proceedings of the 27th annual international …, 2000 - dl.acm.org
Value locality, a recently discovered program attribute that describes the likelihood of the
recurrence of previously-seen program values, has been studied enthusiastically in the …