Quantifying the potential benefit of overlapping communication and computation in large-scale...

Y Zhuo, C Wang, M Zhang, R Wang, D Niu… - Proceedings of the …, 2019 - dl.acm.org

Processing-In-Memory (PIM) architectures based on recent technology advances (e.g.,
Hybrid Memory Cube) demonstrate great potential for graph processing. However, existing …

Cited by 171 Related articles All 4 versions

[PDF] hal.science

Damaris: How to efficiently leverage multicore parallelism to achieve scalable, jitter-free I/O

M Dorier, G Antoniu, F Cappello… - 2012 IEEE International …, 2012 - ieeexplore.ieee.org

With exascale computing on the horizon, the performance variability of I/O systems
represents a key challenge in sustaining high performance. In many HPC applications, I/O is …

Cited by 149 Related articles All 16 versions

[PDF] psu.edu

Using run-time reconfiguration for fault injection in hardware prototypes

L Antoni, R Leveugle, M Feher - 17th IEEE International …, 2002 - ieeexplore.ieee.org

In this paper, a new methodology for the injection of single event upsets (SEU) in memory
elements is introduced. SEUs in memory elements can occur due to many reasons (e.g. …

Cited by 181 Related articles All 16 versions

[PDF] researchgate.net

Light-weight communications on Intel's single-chip cloud computer processor

RF Van der Wijngaart, TG Mattson… - ACM SIGOPS Operating …, 2011 - dl.acm.org

Many-core chips are changing the way high-performance computing systems are built and
programmed. As it is becoming increasingly difficult to maintain cache coherence across …

Cited by 138 Related articles All 4 versions

[PDF] psu.edu

PTG: an abstraction for unhindered parallelism

A Danalis, G Bosilca, A Bouteiller… - … on Domain-Specific …, 2014 - ieeexplore.ieee.org

Increased parallelism and use of heterogeneous computing resources is now an
established trend in High Performance Computing (HPC), a trend that, looking forward to …

Cited by 73 Related articles All 12 versions

Parallel job scheduling for power constrained HPC systems

M Etinski, J Corbalan, J Labarta, M Valero - Parallel Computing, 2012 - Elsevier

Power has become the primary constraint in high performance computing. Traditionally,
parallel job scheduling policies have been designed to improve certain job performance …

Cited by 81 Related articles All 3 versions

[PDF] googleapis.com

Network interface and protocol

SL Pope, DE Roberts, DJ Riddoch… - US Patent 7,844,742, 2010 - Google Patents

A communication interface for providing an interface between a data link and a data
processor, the data processor being capable of supporting an operating system and a user …

Cited by 112 Related articles All 4 versions

[PDF] researchgate.net

MPI-aware compiler optimizations for improving communication-computation overlap

A Danalis, L Pollock, M Swany, J Cavazos - Proceedings of the 23rd …, 2009 - dl.acm.org

Several existing compiler transformations can help improve communication-computation
overlap in MPI applications. However, traditional compilers treat calls to the MPI library as a …

Cited by 74 Related articles All 8 versions

[PDF] googleapis.com

Encapsulated accelerator

SL Pope - US Patent 9,880,964, 2018 - Google Patents

A data processing system comprising a host computer system and a network interface
device for connection to a network, the host computer system and network interface device …

Cited by 47 Related articles All 4 versions

[PDF] cug.org

[PDF] Leveraging the Cray Linux Environment Core Specialization feature to realize MPI asynchronous progress on Cray XE systems

H Pritchard, D Roweth, D Henseler… - Proceedings of the Cray …, 2012 - cug.org

Cray has enhanced the Linux operating system with a Core Specialization (CoreSpec)
feature that allows for differentiated use of the compute cores available on Cray XE compute …

Cited by 55 Related articles All 2 versions