Efficient metadata indexing for HPC storage systems
AK Paul, B Wang, N Rutman, C Spitz… - 2020 20th IEEE/ACM …, 2020 - ieeexplore.ieee.org
The increase in data generation rate along with the scale of today's high performance
computing (HPC) storage systems make finding and managing files extremely difficult …
An integrated indexing and search service for distributed file systems
Data services such as search, discovery, and management in scalable distributed
environments have traditionally been decoupled from the underlying file systems, and are …
MIQS: Metadata indexing and querying service for self-describing file formats
Scientific applications often store datasets in self-describing data file formats, such as HDF5
and netCDF. Regrettably, to efficiently search the metadata within these files remains …
Strategy for research data management services in Indonesia
E Marlina, B Purwandari - Procedia Computer Science, 2019 - Elsevier
Research data management (RDM) ensures the availability of data access and long term
data preservation. Its practices are common in developed countries. On the other hand, it is …
SciSpace: A scientific collaboration workspace for geo-distributed HPC data centers
Future terabit networks are committed to dramatically improving big data motion between
geographically dispersed HPC data centers. The scientific community takes advantage of …
GUFI: fast, secure file system metadata search for both privileged and unprivileged users
D Manno, J Lee, P Challa, Q Zheng… - … Conference for High …, 2022 - ieeexplore.ieee.org
Modern High-Performance Computing (HPC) data centers routinely store massive data sets
resulting in millions of directories and billions of files. To efficiently search and sift through …
A content fingerprint-based cluster-wide inline deduplication for shared-nothing storage systems
Deduplication has been principally employed in distributed storage systems to improve
storage space efficiency. Traditional deduplication research ignores the design …
Exploring metadata search essentials for scientific data management
Scientific experiments and observations store massive amounts of data in various scientific
file formats. Metadata, which describes the characteristics of the data, is commonly used to …
SCANNS: Towards scalable and concurrent data indexing and searching in high-end computing system
AI Orhean, A Giannakou… - 2022 22nd IEEE …, 2022 - ieeexplore.ieee.org
Increasing data volumes, particularly in science and engineering, has resulted in the
widespread adoption of parallel and distributed file systems for data storage and access …
Hades: A context-aware active storage framework for accelerating large-scale data analysis
Modern simulation workflows generate and analyze massive amounts of data using I/O
libraries like Adios2 and NetCDF. Although extensive work has optimized the I/O processes …