H5Spark: bridging the I/O gap between Spark and scientific data formats on HPC systems

J Liu, E Racah, Q Koziol, RS Canon, A Gittens… - Cray user …, 2016 - researchgate.net
The Spark framework has been tremendously powerful for performing Big Data analytics in
distributed data centers. However, using Spark to analyze large-scale scientific data on HPC …

A data-driven approach to nation-scale building energy modeling

AS Berres, BC Bass, MB Adams… - … Conference on Big …, 2021 - ieeexplore.ieee.org
In 2019, 125 million US residential and commercial buildings consumed $412 billion in
energy bills. These buildings currently consume 40% of the nation's primary energy, 73% of …

Client-side straggler-aware I/O scheduler for object-based parallel file systems

N Tavakoli, D Dai, Y Chen - Parallel Computing, 2019 - Elsevier
Object-based parallel file systems have emerged as promising storage solutions for high-
performance computing (HPC) systems. Despite the fact that object storage provides a …

Log-assisted straggler-aware I/O scheduler for high-end computing

N Tavakoli, D Dai, Y Chen - 2016 45th International …, 2016 - ieeexplore.ieee.org
Object-based parallel file systems have emerged as promising storage solutions for high-
end computing (HEC) systems. Despite the fact that object storage provides a flexible …

Concurrent dynamic memory coalescing on GoblinCore-64 architecture

X Wang, JD Leidel, Y Chen - … of the Second International Symposium on …, 2016 - dl.acm.org
The majority of modern microprocessors are architected to utilize multi-level data caches as
a primary optimization to reduce the latency and increase the perceived bandwidth from an …

Accelerating Columnar Storage Based on Asynchronous Skipping Strategy

W Li, Z Yang, L Deng, Z Cheng, W Wen, Y He - Big Data Research, 2023 - Elsevier
Many database applications, such as OnLine Analytical Processing (OLAP), web-based
information extraction or scientific computation, need to select a subset of fields based on …

In situ storage layout optimization for AMR spatio-temporal read accesses

H Tang, S Byna, S Harenberg, W Zhang… - 2016 45th …, 2016 - ieeexplore.ieee.org
Analyses of large simulation data often concentrate on regions in space and in time that
contain important information. As simulations adopt Adaptive Mesh Refinement (AMR), the …

Heavy-tailed distribution of parallel I/O system response time

B Dong, S Byna, K Wu - Proceedings of the 10th Parallel Data Storage …, 2015 - dl.acm.org
Estimating I/O time of applications is critical for computing system research and
developments, such as performance tuning and job scheduling. Parallel I/O systems on …

Debugging in Parallel or Sequential: An Empirical Study.

Y Pang, X Xue, AS Namin - J. Softw., 2015 - jsoftware.us
Faults need to be identified, localized, and removed from programs. Empirical studies show
that coverage-based faults localizations effectively target bugs, even in the presence of …

Distributed NoSQL storage for extreme-scale system services

T Li, I Raicu - IEEE/ACM Supercomputing PhD Showcase, 2015 - 216.47.155.57
Today with the rapidly accumulated data, data-driven applications are emerging in science
and commercial areas. On both HPC systems and clouds the continuously widening …