Entropy-aware I/O pipelining for large-scale deep learning on HPC systems Y Zhu, F Chowdhury, H Fu, A Moody, K Mohror, K Sato, W Yu 2018 IEEE 26th International Symposium on Modeling, Analysis, and Simulation …, 2018 | 72 | 2018 |
I/O characterization and performance evaluation of BeeGFS for deep learning F Chowdhury, Y Zhu, T Heer, S Paredes, A Moody, R Goldstone, ... Proceedings of the 48th International Conference on Parallel Processing, 1-10, 2019 | 71 | 2019 |
Efficient user-level storage disaggregation for deep learning Y Zhu, W Yu, B Jiao, K Mohror, A Moody, F Chowdhury 2019 IEEE International Conference on Cluster Computing (CLUSTER), 1-12, 2019 | 42 | 2019 |
OAWS: Memory occlusion aware warp scheduling B Wang, Y Zhu, W Yu Proceedings of the 2016 International Conference on Parallel Architectures …, 2016 | 31 | 2016 |
Direct-FUSE: Removing the middleman for high-performance FUSE file system support Y Zhu, T Wang, K Mohror, A Moody, K Sato, M Khan, W Yu Proceedings of the 8th International Workshop on Runtime and Operating …, 2018 | 25 | 2018 |
MetaKV: A key-value store for metadata management of distributed burst buffers T Wang, A Moody, Y Zhu, K Mohror, K Sato, T Islam, W Yu 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2017 | 24 | 2017 |
FARMS: Efficient MapReduce speculation for failure recovery in short jobs H Fu, H Chen, Y Zhu, W Yu Parallel Computing 61, 68-82, 2017 | 19 | 2017 |
SHMemCache: enabling memcached on the OpenSHMEM global address model H Fu, K Singharoy, MG Venkata, Y Zhu, W Yu OpenSHMEM and Related Technologies. Enhancing OpenSHMEM for Hybrid …, 2016 | 8 | 2016 |
A case study of MapReduce speculation for failure recovery H Fu, Y Zhu, W Yu Proceedings of the 2015 International Workshop on Data-Intensive Scalable …, 2015 | 8 | 2015 |
Emulating I/O behavior in scientific workflows on high performance computing systems F Chowdhury, Y Zhu, F Di Natale, A Moody, E Gonsiorowski, K Mohror, ... 2020 IEEE/ACM Fifth International Parallel Data Systems Workshop (PDSW), 34-39, 2020 | 7 | 2020 |
Multi-client DeepIO for large-scale deep learning on HPC systems Y Zhu, F Chowdhury, H Fu, A Moody, K Mohror, K Sato, W Yu Proceedings of the International Conference on High Performance Computing …, 2018 | 4 | 2018 |
Enhancing MapReduce Fault Recovery Through Binocular Speculation H Fu, Y Zhu, AK Nath, MM Khan, W Yu arXiv preprint arXiv:1901.07715, 2019 | 1 | 2019 |
Characterizing Training Performance and Energy for Foundation Models and Image Classifiers on Multi-Instance GPUs C Espenshade, R Peng, E Hong, M Calman, Y Zhu, P Parida, EK Lee, ... Proceedings of the 4th Workshop on Machine Learning and Systems, 47-55, 2024 | | 2024 |
Towards Pareto Optimal Throughput in Small Language Model Serving PG Recasens, Y Zhu, C Wang, EK Lee, O Tardieu, A Youssef, J Torres, ... Proceedings of the 4th Workshop on Machine Learning and Systems, 144-152, 2024 | | 2024 |
User-Level I/O Accelerations for High-Performance Deep Learning Applications Y Zhu The Florida State University, 2021 | | 2021 |
Direct-FUSE K Sato, KM Mohror, AT Moody, Y Zhu, M Khan, T Wang, W Yu Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States), 2019 | | 2019 |
Characterization and Tuning of BeeGFS for Deep Neural Networks F Chowdhury, Y Zhu, T Heer, A Moody, K Mohror, W Yu Lawrence Livermore National Lab.(LLNL), Livermore, CA (United States), 2018 | | 2018 |
MetaKV: A Specialized Key-Value Store for Distributed Burst Buffer Systems T Wang, A Moody, Y Zhu, K Sato, K Mohror, T Islam, W Yu Lawrence Livermore National Lab.(LLNL), Livermore, CA (United States), 2016 | | 2016 |