Argo: A real-time network-on-chip architecture with an efficient GALS implementation

E Kasapaki, M Schoeberl, RB Sørensen… - … Transactions on Very …, 2015 - ieeexplore.ieee.org
In this paper, we present an area-efficient, globally asynchronous, locally synchronous
network-on-chip (NoC) architecture for a hard real-time multiprocessor platform. The NoC …

Towards ultra-high-speed cryogenic single-flux-quantum computing

K Ishida, M Tanaka, T Ono, K Inoue - IEICE Transactions on …, 2018 - search.ieice.org
CMOS microprocessors are limited in their capacity for clock speed improvement because of
increasing computing power, i.e., they face a power-wall problem. Single-flux-quantum (SFQ) …

Chip multiprocessing and the cell broadband engine

M Gschwind - Proceedings of the 3rd conference on Computing …, 2006 - dl.acm.org
Chip multiprocessing has become an exciting new direction for system designers to deliver
increased performance by exploiting CMOS scaling. We discuss key design decisions facing …

[BOOK][B] Embedded DSP processor design: Application specific instruction set processors

D Liu - 2008 - books.google.com
This book provides design methods for Digital Signal Processors and Application Specific
Instruction set Processors, based on the author's extensive, industrial design experience …

The Cell Broadband Engine: exploiting multiple levels of parallelism in a chip multiprocessor

M Gschwind - International journal of parallel programming, 2007 - Springer
As CMOS feature sizes continue to shrink and traditional microarchitectural methods for
delivering high performance (e.g., deep pipelining) become too expensive and power …

A programmable 512 GOPS stream processor for signal, image, and video processing

BK Khailany, T Williams, J Lin, EP Long… - IEEE Journal of solid …, 2008 - ieeexplore.ieee.org
A 34-million transistor stream processor system-on-chip (SoC) for signal, image, and video
processing contains 80 parallel integer ALUs organized into 16 data-parallel lanes with a 5 …

Introduction to the cell broadband engine architecture

CR Johns, DA Brokenshire - IBM Journal of Research and …, 2007 - ieeexplore.ieee.org
This paper provides an overview of the Cell Broadband Engine™ Architecture (CBEA). The
CBEA defines a revolutionary extension to a more conventional processor organization and …

μManycore: A Cloud-Native CPU for Tail at Scale

J Stojkovic, C Liu, M Shahbaz, J Torrellas - Proceedings of the 50th …, 2023 - dl.acm.org
Microservices are emerging as a popular cloud-computing paradigm. Microservice
environments execute typically-short service requests that interact with one another via …

Optimizing matrix multiplication for a short-vector SIMD architecture – CELL processor

J Kurzak, W Alvaro, J Dongarra - Parallel Computing, 2009 - Elsevier
Matrix multiplication is one of the most common numerical operations, especially in the area
of dense linear algebra, where it forms the core of many important algorithms, including …

Multi-core acceleration of chemical kinetics for simulation and prediction

JC Linford, J Michalakes, M Vachharajani… - Proceedings of the …, 2009 - dl.acm.org
This work implements a computationally expensive chemical kinetics kernel from a large-
scale community atmospheric model on three multi-core platforms: NVIDIA GPUs using …