GraphQ: Scalable PIM-based graph processing
Processing-In-Memory (PIM) architectures based on recent technology advances (e.g.,
Hybrid Memory Cube) demonstrate great potential for graph processing. However, existing …
Damaris: How to efficiently leverage multicore parallelism to achieve scalable, jitter-free I/O
With exascale computing on the horizon, the performance variability of I/O systems
represents a key challenge in sustaining high performance. In many HPC applications, I/O is …
Using run-time reconfiguration for fault injection in hardware prototypes
L Antoni, R Leveugle, M Feher - 17th IEEE International …, 2002 - ieeexplore.ieee.org
In this paper, a new methodology for the injection of single event upsets (SEU) in memory
elements is introduced. SEUs in memory elements can occur due to many reasons (e.g. …
Light-weight communications on Intel's single-chip cloud computer processor
RF Van der Wijngaart, TG Mattson… - ACM SIGOPS Operating …, 2011 - dl.acm.org
Many-core chips are changing the way high-performance computing systems are built and
programmed. As it is becoming increasingly difficult to maintain cache coherence across …
PTG: an abstraction for unhindered parallelism
Increased parallelism and use of heterogeneous computing resources is now an
established trend in High Performance Computing (HPC), a trend that, looking forward to …
Parallel job scheduling for power constrained HPC systems
Power has become the primary constraint in high performance computing. Traditionally,
parallel job scheduling policies have been designed to improve certain job performance …
Network interface and protocol
SL Pope, DE Roberts, DJ Riddoch… - US Patent 7,844,742, 2010 - Google Patents
A communication interface for providing an interface between a data link and a data
processor, the data processor being capable of supporting an operating system and a user …
MPI-aware compiler optimizations for improving communication-computation overlap
Several existing compiler transformations can help improve communication-computation
overlap in MPI applications. However, traditional compilers treat calls to the MPI library as a …
Encapsulated accelerator
SL Pope - US Patent 9,880,964, 2018 - Google Patents
A data processing system comprising a host computer system and a network interface
device for connection to a network, the host computer system and network interface device …
Leveraging the Cray Linux Environment Core Specialization feature to realize MPI asynchronous progress on Cray XE systems
H Pritchard, D Roweth, D Henseler… - Proceedings of the Cray …, 2012 - cug.org
Cray has enhanced the Linux operating system with a Core Specialization (CoreSpec)
feature that allows for differentiated use of the compute cores available on Cray XE compute …