The worst-case execution-time problem—overview of methods and survey of tools
R Wilhelm, J Engblom, A Ermedahl, N Holsti… - ACM Transactions on …, 2008 - dl.acm.org
The determination of upper bounds on execution times, commonly called worst-case
execution times (WCETs), is a necessary step in the development and validation process for …
GPU-accelerated molecular modeling coming of age
Graphics processing units (GPUs) have traditionally been used in molecular modeling solely
for visualization of molecular structures and animation of trajectories resulting from …
OpenCL: A parallel programming standard for heterogeneous computing systems
Introduction to the Cell multiprocessor
JA Kahle, MN Day, HP Hofstee… - IBM journal of …, 2005 - ieeexplore.ieee.org
This paper provides an introductory overview of the Cell multiprocessor. Cell represents a
revolutionary extension of conventional microprocessor architecture and organization. The …
Exploiting coarse-grained task, data, and pipeline parallelism in stream programs
As multicore architectures enter the mainstream, there is a pressing demand for high-level
programming models that can effectively map to them. Stream programming offers an …
Interconnects in the third dimension: Design challenges for 3D ICs
K Bernstein, P Andry, J Cann, P Emma… - Proceedings of the 44th …, 2007 - dl.acm.org
Despite generation upon generation of scaling, computer chips have until now remained
essentially 2-dimensional. Improvements in on-chip wire delay and in the maximum number …
GViM: GPU-accelerated virtual machines
V Gupta, A Gavrilovska, K Schwan, H Kharche… - Proceedings of the 3rd …, 2009 - dl.acm.org
The use of virtualization to abstract underlying hardware can aid in sharing such resources
and in efficiently managing their use by high performance applications. Unfortunately …
Synergistic processing in Cell's multicore architecture
M Gschwind, HP Hofstee, B Flachs, M Hopkins… - IEEE micro, 2006 - ieeexplore.ieee.org
Eight synergistic processor units enable the Cell Broadband Engine's breakthrough
performance. The SPU architecture implements a novel, pervasively data-parallel …
Cell multiprocessor communication network: Built for speed
Multicore designs promise various power-performance and area-performance benefits. But
inadequate design of the on-chip communication network can deprive applications of these …
Extending Amdahl's law for energy-efficient computing in the many-core era
Extending Amdahl’s Law for Energy-Efficient Computing in the Many-Core Era Dong Hyuk Woo …