Approximate communication: Techniques for reducing communication bottlenecks in large-scale parallel systems

F Betzel, K Khatamifard, H Suresh, DJ Lilja… - ACM Computing …, 2018 - dl.acm.org
Approximate computing has gained research attention recently as a way to increase energy
efficiency and/or performance by exploiting some applications' intrinsic error resiliency …

DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks

GF Oliveira, J Gómez-Luna, L Orosa, S Ghose… - IEEE …, 2021 - ieeexplore.ieee.org
Data movement between the CPU and main memory is a first-order obstacle against
improving performance, scalability, and energy efficiency in modern systems. Computer systems …

[BOOK][B] Computer architecture: a quantitative approach

JL Hennessy, DA Patterson - 2017 - books.google.com
Computer Architecture: A Quantitative Approach, Sixth Edition has been considered
essential reading by instructors, students and practitioners of computer design for over 20 …

Phase tracking and prediction

T Sherwood, S Sair, B Calder - ACM SIGARCH Computer Architecture …, 2003 - dl.acm.org
In a single second a modern processor can execute billions of instructions. Obtaining a
bird's eye view of the behavior of a program at these speeds can be a difficult task when all …

Asyfunc: A high-performance and resource-efficient serverless inference system via asymmetric functions

Q Pei, Y Yuan, H Hu, Q Chen, F Liu - … of the 2023 ACM Symposium on …, 2023 - dl.acm.org
Recent advances in deep learning (DL) have spawned various intelligent cloud services
with well-trained DL models. Nevertheless, it is nontrivial to maintain the desired end-to-end …

Focusing processor policies via critical-path prediction

B Fields, S Rubin, R Bodik - … of the 28th annual international symposium …, 2001 - dl.acm.org
Although some instructions hurt performance more than others, current processors typically
apply scheduling and speculation as if each instruction was equally costly. Instruction cost …

I-CASH: Intelligently coupled array of SSD and HDD

Q Yang, J Ren - 2011 IEEE 17th International Symposium on …, 2011 - ieeexplore.ieee.org
This paper presents a new disk I/O architecture composed of an array of a flash memory
SSD (solid state disk) and a hard disk drive (HDD) that are intelligently coupled by a special …

In search of speculative thread-level parallelism

JT Oplinger, DL Heine, MS Lam - … International Conference on …, 1999 - ieeexplore.ieee.org
The paper focuses on the problem of how to find and effectively exploit speculative
thread-level parallelism. Our studies show that speculating only on loops does not yield sufficient …

Differential FCM: Increasing value prediction accuracy by improving table usage efficiency

B Goeman, H Vandierendonck… - … Symposium on High …, 2001 - ieeexplore.ieee.org
Value prediction is a relatively new technique to increase the Instruction Level Parallelism
(ILP) in future microprocessors. An important problem when designing a value predictor is …

On the value locality of store instructions

KM Lepak, MH Lipasti - Proceedings of the 27th annual international …, 2000 - dl.acm.org
Value locality, a recently discovered program attribute that describes the likelihood of the
recurrence of previously-seen program values, has been studied enthusiastically in the …