A survey and classification of storage deduplication systems
The automatic elimination of duplicate data in a storage system, commonly known as
deduplication, is increasingly accepted as an effective technique to reduce storage costs …
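A minimal sketch of the core idea behind storage deduplication (not any surveyed system's implementation): split a stream into chunks, fingerprint each chunk with a cryptographic hash, and store each unique chunk once, keeping an ordered recipe of fingerprints to reconstruct the stream.

```python
import hashlib

def dedup_store(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and keep only unique ones.

    Returns (store, recipe): store maps fingerprint -> chunk bytes,
    recipe is the ordered fingerprint list needed to rebuild the stream.
    """
    store = {}   # fingerprint -> unique chunk bytes
    recipe = []  # ordered fingerprints for reconstruction
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        fp = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fp, chunk)  # duplicate chunks are stored once
        recipe.append(fp)
    return store, recipe

def restore(store, recipe) -> bytes:
    """Rebuild the original stream from the chunk store and recipe."""
    return b"".join(store[fp] for fp in recipe)
```

Real systems differ mainly in how chunks are formed (fixed-size vs. content-defined) and how the fingerprint index is kept fast at scale, which is exactly the design space the surveys above classify.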
Big data reduction framework for value creation in sustainable enterprises
Value creation is a major sustainability factor for enterprises, in addition to profit
maximization and revenue generation. Modern enterprises collect big data from various …
A comprehensive study of the past, present, and future of data deduplication
Data deduplication, an efficient approach to data reduction, has gained increasing attention
and popularity in large-scale storage systems due to the explosive growth of digital data. It …
POCLib: A high-performance framework for enabling near orthogonal processing on compression
Parallel technology has boosted data processing in recent years, and parallel direct data
processing on hierarchically compressed documents exhibits great promise. The high …
Big data reduction methods: a survey
Research on big data analytics is entering a new phase, called fast data, where multiple
gigabytes of data arrive in big data systems every second. Modern big data systems …
iDedup: latency-aware, inline data deduplication for primary storage
K Srinivasan, T Bisson, GR Goodson, K Voruganti - Fast, 2012 - usenix.org
Deduplication technologies are increasingly being deployed to reduce cost and increase
space-efficiency in corporate data centers. However, prior research has not applied …
Design tradeoffs for data deduplication performance in backup workloads
Data deduplication has become a standard component in modern backup systems. In order
to understand the fundamental tradeoffs in each of its design choices (such as prefetching …
FastCDC: A fast and efficient Content-Defined chunking approach for data deduplication
Content-Defined Chunking (CDC) has been playing a key role in data deduplication
systems in the past 15 years or so due to its high redundancy detection ability. However …
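A small sketch of the CDC idea the FastCDC abstract refers to (the gear table and mask below are illustrative constants, not FastCDC's actual parameters): a rolling hash is computed byte by byte, and a chunk boundary is declared wherever the hash matches a bit mask, so boundaries follow content rather than fixed offsets and survive insertions.

```python
import random

# Hypothetical gear table: one random 32-bit value per byte value.
random.seed(42)
GEAR = [random.getrandbits(32) for _ in range(256)]

def cdc_chunks(data: bytes, mask: int = 0x1FFF,
               min_size: int = 64, max_size: int = 16384):
    """Cut chunks where a gear-style rolling hash matches the mask.

    min_size/max_size bound chunk sizes; a k-bit mask gives an
    expected average chunk size of roughly 2**k bytes.
    """
    chunks = []
    start = 0
    h = 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF  # rolling gear hash
        length = i + 1 - start
        if (length >= min_size and (h & mask) == 0) or length >= max_size:
            chunks.append(data[start:i + 1])   # content-defined boundary
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])            # trailing chunk
    return chunks
```

Because boundaries depend only on a sliding window of content, an insertion near the start of a file shifts at most a few chunks before the cut points re-synchronize, which is what gives CDC its high redundancy detection ability.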
WAN-optimized replication of backup datasets using stream-informed delta compression
P Shilane, M Huang, G Wallace, W Hsu - ACM Transactions on Storage …, 2012 - dl.acm.org
Replicating data off site is critical for disaster recovery reasons, but the current approach of
transferring tapes is cumbersome and error prone. Replicating across a wide area network …
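A minimal illustration of delta compression in the WAN-replication setting (a stand-in using Python's `difflib`, not the paper's stream-informed scheme): when the replica already holds a similar base version, the sender ships only copy/insert instructions instead of the whole target, so the bytes crossing the wide area network shrink to roughly the changed regions.

```python
import difflib

def make_delta(base: bytes, target: bytes):
    """Encode target as copy/insert ops against base."""
    sm = difflib.SequenceMatcher(None, base, target, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))      # reuse bytes at the replica
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))  # ship only new bytes
        # 'delete': base bytes simply not copied, no op needed
    return ops

def apply_delta(base: bytes, ops) -> bytes:
    """Reconstruct target at the replica from base plus the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, start, length = op
            out += base[start:start + length]
        else:
            out += op[1]
    return bytes(out)
```

The "stream-informed" part of the paper concerns how a similar base is located cheaply during backup replication; the delta encoding itself follows this copy/insert pattern.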
Accelerating restore and garbage collection in deduplication-based backup systems via exploiting historical information
In deduplication-based backup systems, the chunks of each backup are physically scattered
after deduplication, which causes a challenging fragmentation problem. The fragmentation …