Small files' problem in Hadoop: A systematic literature review

R Aggarwal, J Verma, M Siwach - … of King Saud University-Computer and …, 2022 - Elsevier
Apache Hadoop is an open-source software library which integrates a wide variety of
software tools and utilities to facilitate the distributed batch processing of big data sets …

Handling small size files in Hadoop: Challenges, opportunities, and review

MA Ahad, R Biswas - Soft Computing in Data Analytics: Proceedings of …, 2019 - Springer
Recent technological advancements in the field of computing have been the cause of
voluminous generation of data which cannot be handled effectively by traditionally available …

Optimization strategy of Hadoop small file storage for big data in healthcare

H He, Z Du, W Zhang, A Chen - The Journal of Supercomputing, 2016 - Springer
As the era of “big data” comes, the data processing platform like Hadoop was born at the
right moment. But its carrier for storage, Hadoop distributed file system (HDFS) has the great …

Dynamic merging based small file storage (DM-SFS) architecture for efficiently storing small size files in Hadoop

MA Ahad, R Biswas - Procedia computer science, 2018 - Elsevier
In today's computing era, the voluminous data that is generated every moment needs
special tools and techniques for its effective and efficient handling and storage. In this paper …

An efficient distributed caching for accessing small files in HDFS

K Bok, H Oh, J Lim, Y Pae, H Choi, B Lee, J Yoo - Cluster Computing, 2017 - Springer
In this paper, we propose a distributed caching scheme to efficiently access small files in
Hadoop distributed file system. The proposed scheme reduces the volume of metadata to …

Efficient storage of multi-sensor object-tracking data

X Hao, P Jin, L Yue - IEEE Transactions on Parallel and …, 2015 - ieeexplore.ieee.org
The rapid development of Internet of Things (IoT) enables people to track objects by
deploying multiple sensors, e.g., to track people in indoor spaces using RFID sensors. Multi …

Addressing Hadoop's small file problem with an appendable archive file format

T Renner, J Müller, L Thamsen, O Kao - Proceedings of the Computing …, 2017 - dl.acm.org
Hadoop has been used widely for data analytic tasks in various domains. At the same time,
data volume is expected to grow even further in the next years. Hadoop recently introduced …

xMeta: SSD-HDD-hybrid Optimization for Metadata Maintenance of Cloud-scale Object Storage

Y Chen, Q Ke, H Li, Y Wu, Y Zhang - ACM Transactions on Architecture …, 2024 - dl.acm.org
Object storage has been widely used in the cloud. Traditionally, the size of object metadata
is much smaller than that of object data, and thus existing object storage systems (such as …

Multi-tier storage system with dynamic power management utilizing configurable data mover modules

S Faibish, D Ting, JM Pedone Jr, P Tzelnic - US Patent 10,140,032, 2018 - Google Patents
An apparatus in one embodiment comprises a storage system having at least first and
second storage tiers each comprising a plurality of storage devices. The storage system …

Multi-tier storage system configured for efficient management of small files associated with Internet of Things

S Faibish, JM Bent, JM Pedone Jr - US Patent 10,853,315, 2020 - Google Patents
An apparatus in one embodiment comprises a multi-tier storage system having at least a
front-end storage tier, a back-end storage tier and a data mover module configured to control …