The microarchitecture of the synergistic processor for a cell processor

E Kasapaki, M Schoeberl, RB Sørensen… - … Transactions on Very …, 2015 - ieeexplore.ieee.org

In this paper, we present an area-efficient, globally asynchronous, locally synchronous
network-on-chip (NoC) architecture for a hard real-time multiprocessor platform. The NoC …

Cited by 124 · Related articles · All 9 versions

Towards ultra-high-speed cryogenic single-flux-quantum computing

K Ishida, M Tanaka, T Ono, K Inoue - IEICE Transactions on …, 2018 - search.ieice.org

CMOS microprocessors are limited in their capacity for clock speed improvement because of
increasing computing power, i.e., they face a power-wall problem. Single-flux-quantum (SFQ) …

Cited by 23 · Related articles · All 6 versions

Chip multiprocessing and the cell broadband engine

M Gschwind - Proceedings of the 3rd conference on Computing …, 2006 - dl.acm.org

Chip multiprocessing has become an exciting new direction for system designers to deliver
increased performance by exploiting CMOS scaling. We discuss key design decisions facing …

Cited by 203 · Related articles · All 12 versions

[BOOK][B] Embedded DSP processor design: Application specific instruction set processors

D Liu - 2008 - books.google.com

This book provides design methods for Digital Signal Processors and Application Specific
Instruction set Processors, based on the author's extensive, industrial design experience …

Cited by 144 · Related articles · All 6 versions

The Cell Broadband Engine: exploiting multiple levels of parallelism in a chip multiprocessor

M Gschwind - International journal of parallel programming, 2007 - Springer

As CMOS feature sizes continue to shrink and traditional microarchitectural methods for
delivering high performance (e.g., deep pipelining) become too expensive and power …

Cited by 165 · Related articles · All 17 versions

A programmable 512 GOPS stream processor for signal, image, and video processing

BK Khailany, T Williams, J Lin, EP Long… - IEEE Journal of solid …, 2008 - ieeexplore.ieee.org

A 34-million transistor stream processor system-on-chip (SoC) for signal, image, and video
processing contains 80 parallel integer ALUs organized into 16 data-parallel lanes with a 5 …

Cited by 146 · Related articles · All 11 versions

Introduction to the cell broadband engine architecture

CR Johns, DA Brokenshire - IBM Journal of Research and …, 2007 - ieeexplore.ieee.org

This paper provides an overview of the Cell Broadband Engine™ Architecture (CBEA). The
CBEA defines a revolutionary extension to a more conventional processor organization and …

Cited by 130 · Related articles · All 5 versions

μManycore: A Cloud-Native CPU for Tail at Scale

J Stojkovic, C Liu, M Shahbaz, J Torrellas - Proceedings of the 50th …, 2023 - dl.acm.org

Microservices are emerging as a popular cloud-computing paradigm. Microservice
environments execute typically-short service requests that interact with one another via …

Cited by 7 · Related articles · All 5 versions

Optimizing matrix multiplication for a short-vector SIMD architecture–CELL processor

J Kurzak, W Alvaro, J Dongarra - Parallel Computing, 2009 - Elsevier

Matrix multiplication is one of the most common numerical operations, especially in the area
of dense linear algebra, where it forms the core of many important algorithms, including …

Cited by 84 · Related articles · All 18 versions

Multi-core acceleration of chemical kinetics for simulation and prediction

JC Linford, J Michalakes, M Vachharajani… - Proceedings of the …, 2009 - dl.acm.org

This work implements a computationally expensive chemical kinetics kernel from a large-
scale community atmospheric model on three multi-core platforms: NVIDIA GPUs using …

Cited by 71 · Related articles · All 7 versions