SecureML: A system for scalable privacy-preserving machine learning

P Mohassel, Y Zhang - 2017 IEEE symposium on security and …, 2017 - ieeexplore.ieee.org
Machine learning is widely used in practice to produce predictive models for applications
such as image processing, speech and text recognition. These models are more accurate …

An integrated tutorial on InfiniBand, verbs, and MPI

P MacArthur, Q Liu, RD Russell… - … Surveys & Tutorials, 2017 - ieeexplore.ieee.org
This tutorial presents the details of the interconnection network utilized in many high
performance computing (HPC) systems today. “InfiniBand” is the hardware interconnect …

Generation of an error set that emulates software faults based on field data

J Christmansson, R Chillarege - Proceedings of Annual …, 1996 - ieeexplore.ieee.org
A significant issue in fault injection experiments is that the injected faults are representative
of software faults observed in the field. Another important issue is the time used, as we want …

High-performance routing with multipathing and path diversity in Ethernet and HPC networks

M Besta, J Domke, M Schneider… - … on Parallel and …, 2020 - ieeexplore.ieee.org
The recent line of research into topology design focuses on lowering network diameter.
Many low-diameter topologies such as Slim Fly or Jellyfish that substantially reduce cost …

Deadlock-free oblivious routing for arbitrary topologies

J Domke, T Hoefler, WE Nagel - 2011 IEEE International …, 2011 - ieeexplore.ieee.org
Efficient deadlock-free routing strategies are crucial to the performance of large-scale
computing systems. There are many methods but it remains a challenge to achieve lowest …

HyperX topology: First at-scale implementation and comparison to the fat-tree

J Domke, S Matsuoka, IR Ivanov, Y Tsushima… - Proceedings of the …, 2019 - dl.acm.org
The de-facto standard topology for modern HPC systems and data-centers are Folded Clos
networks, commonly known as Fat-Trees. The number of network endpoints in these …

Fail-in-place network design: interaction between topology, routing algorithm and failures

J Domke, T Hoefler, S Matsuoka - SC'14: Proceedings of the …, 2014 - ieeexplore.ieee.org
The growing system size of high performance computers results in a steady decrease of the
mean time between failures. Exchanging network components often requires whole system …

Graph based routing algorithm for torus topology and its evaluation for the Angara interconnect

A Mukosey, A Semenov, A Tretiakov - Journal of Parallel and Distributed …, 2024 - Elsevier
Several approaches and techniques exist to resolve load balancing problem in general and
torus topology networks. Graph methods are natural ways to perform balancing of routing …

Scheduling-aware routing for supercomputers

J Domke, T Hoefler - SC'16: Proceedings of the International …, 2016 - ieeexplore.ieee.org
The interconnection network has a large influence on total cost, application performance,
energy consumption, and overall system efficiency of a supercomputer. Unfortunately …

Routing on the dependency graph: A new approach to deadlock-free high-performance routing

J Domke, T Hoefler, S Matsuoka - Proceedings of the 25th ACM …, 2016 - dl.acm.org
Lossless interconnection networks are omnipresent in high performance computing
systems, data centers and network-on-chip architectures. Such networks require efficient …