Tiny but mighty: designing and realizing scalable latency tolerance for manycore SoCs

M Orenes-Vera, A Manocha, J Balkind, F Gao… - Proceedings of the 49th …, 2022 - dl.acm.org
Modern computing systems employ significant heterogeneity and specialization to meet
performance targets at manageable power. However, memory latency bottlenecks remain …

[PDF] Performance improvement with circuit-level speculation

T Liu, SL Lu - Proceedings of the 33rd annual ACM/IEEE …, 2000 - dl.acm.org
Current superscalar microprocessors' performance depends on their frequency and the
number of useful instructions that can be processed per cycle (IPC). In this paper we …

Multithreading decoupled architectures for complexity-effective general purpose computing

M Sung, R Krashinsky, K Asanović - ACM SIGARCH Computer …, 2001 - dl.acm.org
Decoupled architectures have not traditionally been used in the context of general purpose
computing because of their inability to tolerate control-intensive code that exists across a …

Speculative Precomputation: Exploring the Use of Multithreading for Latency.

H Wang, PH Wang, RD Weldon… - Intel Technology …, 2002 - search.ebscohost.com
Speculative Precomputation (SP) is a technique to improve the latency of single-threaded
applications by utilizing idle multi-threading hardware resources to perform aggressive long …

Design and evaluation of a hierarchical decoupled architecture

WW Ro, SP Crago, AM Despain, JL Gaudiot - The Journal of …, 2006 - Springer
The speed gap between processor and main memory is the major performance bottleneck of
modern computer systems. As a result, today's microprocessors suffer from frequent cache …

[PDF] Microarchitectural miss/execute decoupling

A Roth, CB Zilles, GS Sohi - MEDEA Workshop, 2000 - ftp1.cs.wisc.edu
The decoupled access/execute architecture described a machine that enables the access of
memory values to be decoupled from the consumption of those values. Although never …

Navigating Heterogeneity and Scalability in Modern Chip Design

M Orenes-Vera - 2024 - search.proquest.com
Computing systems have become ubiquitous in the modern world but their design is far from
one-size-fits-all. From battery-powered devices to supercomputers, deployment …

[BOOK] On the Realization of Fine Grained Multithreading in Software

A Grävinghoff - 2002 - Citeseer
This work deals with the design, implementation and evaluation of a multithreading system
that enables fine-grained context switches without hardware support. The current chapter …

[BOOK] Programming Model and Execution Model for OpenMP on the Cyclops-64 Manycore Processor

G Gan - 2010 - capsl.udel.edu
During the last ten years, multicore processors have matured from academic research
projects to real products in industry. They are now used across almost the entire spectrum …

Mini-graph processing

AW Bracy - 2008 - search.proquest.com
For years, single-thread performance was the most dominant force driving processor
development. In recent years, however, the poor scaling of single-thread super-scalar …