GPUnet: Networking abstractions for GPU programs
Despite the popularity of GPUs in high-performance and scientific computing, and despite
increasingly general-purpose hardware capabilities, the use of GPUs in network servers or …
Petascale high order dynamic rupture earthquake simulations on heterogeneous supercomputers
A Heinecke, A Breuer, S Rettenberger… - SC'14: Proceedings …, 2014 - ieeexplore.ieee.org
We present an end-to-end optimization of the innovative Arbitrary high-order DERivative
Discontinuous Galerkin (ADER-DG) software SeisSol targeting Intel® Xeon Phi coprocessor …
BluesMPI: Efficient MPI non-blocking Alltoall offloading designs on modern BlueField smart NICs
In the state-of-the-art production quality MPI (Message Passing Interface) libraries,
communication progress is either performed by the main thread or a separate …
A hierarchical and contextual model for aerial image parsing
J Porway, Q Wang, SC Zhu - International journal of computer vision, 2010 - Springer
In this paper we present a hierarchical and contextual model for aerial image understanding.
Our model organizes objects (cars, roofs, roads, trees, parking lots) in aerial scenes into …
FlexDriver: A network driver for your accelerator
We propose a new system design for connecting hardware and FPGA accelerators to the
network, allowing the accelerator to directly control commodity Network Interface Cards …
The MVAPICH project: Evolution and sustainability of an open source production quality MPI library for HPC
I. OVERVIEW OF THE MVAPICH PROJECT The MVAPICH (for MPI-1) and MVAPICH2 (for
MPI-2 and MPI-3) open-source libraries [?] have been designed and developed during the …
Amplification of probabilistic Boolean formulas
RB Boppana - 26th Annual Symposium on Foundations of …, 1985 - ieeexplore.ieee.org
The amplification of probabilistic Boolean formulas refers to combining independent copies
of such formulas to reduce the error probability. Les Valiant used the amplification method to …
Exploiting GPUDirect RDMA in designing high performance OpenSHMEM for NVIDIA GPU clusters
GPUDirect RDMA (GDR) brings the high-performance communication capabilities of RDMA
networks like InfiniBand (IB) to GPUs (referred to as "Device"). It enables IB network …
Scalable communication architecture for network-attached accelerators
S Neuwirth, D Frey, M Nuessle… - 2015 IEEE 21st …, 2015 - ieeexplore.ieee.org
On the road to Exascale computing, novel communication architectures are required to
overcome the limitations of host-centric accelerators. Typically, accelerator devices require a …
Exploring data migration for future deep-memory many-core systems
Upcoming high-performance computing (HPC) platforms will have more complex memory
hierarchies with high-bandwidth on-package memory and in the future also non-volatile …