Best practices and lessons learned from deploying and operating large-scale data-centric parallel file systems
S Oral, J Simmons, J Hill, D Leverman… - SC'14: Proceedings …, 2014 - ieeexplore.ieee.org
The Oak Ridge Leadership Computing Facility (OLCF) has deployed multiple large-scale
parallel file systems (PFS) to support its operations. During this process, OLCF acquired …
[PDF] OLCF's 1 TB/s, next-generation Lustre file system
The Oak Ridge Leadership Computing Facility (OLCF) at Oak Ridge National
Laboratory (ORNL) has a long history of deploying the world's fastest supercomputers to …
Theta: Rapid installation and acceptance of an XC40 KNL system
In order to provide a stepping stone from the Argonne Leadership Computing Facility's
(ALCF) world class production 10 petaFLOP IBM BlueGene/Q system, Mira, to its next …
Improving large-scale storage system performance via topology-aware and balanced data placement
With the advent of big data, the I/O subsystems of large-scale compute clusters are
becoming a center of focus. More applications are putting greater demands on end-to-end …
[HTML] Accelerating network communication and I/O in scientific high performance computing environments
SM Neuwirth - 2019 - ub.uni-heidelberg.de
High performance computing has become one of the major drivers behind technology
inventions and science discoveries. Originally driven through the increase of operating …
PIFS-Rec: Process-In-Fabric-Switch for Large-Scale Recommendation System Inferences
Deep Learning Recommendation Models (DLRMs) have become increasingly popular and
prevalent in today's datacenters, consuming most of the AI inference cycles. The …
Alleviating I/O interference through workload-aware striping and load-balancing on parallel file systems
Y Tsujita, T Yoshizaki, K Yamamoto, F Sueyasu… - … Conference, ISC High …, 2017 - Springer
Nowadays parallel file systems have been widely used in many supercomputers. Lustre is
one of the most used parallel file systems, and its enhanced file system named FEFS (Fujitsu …
[PDF] A next-generation parallel file system environment for the OLCF
GM Shipman, DA Dillow, D Fuller… - Proceedings of Cray …, 2012 - cug.org
When deployed in 2008/2009 the Spider system at the Oak Ridge National Laboratory's
Leadership Computing Facility (OLCF) was the world's largest scale Lustre parallel file …
[PDF] I/O router placement and fine-grained routing on Titan to support Spider II
The Oak Ridge Leadership Computing Facility (OLCF) introduced the concept of Fine-
Grained Routing in 2008 to improve I/O performance between the Jaguar supercomputer …
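Cooperative burst buffer technique for a layered hybrid storage architecture
Zhou Enqiang, Zhang Wei, Dong Yong, Lu Yutong - Journal of National University of Defense Technology, 2015 - journal.nudt.edu.cn
The volume of data produced and analyzed by scientific computing keeps growing, and the storage systems of high-performance computers
face major challenges in architecture and software management methods. For the new layered hybrid storage architecture of the Tianhe-2 system, an application-coupled cooperative …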