Analysis of data fragments in deduplication system

Zhike Zhang, Zejun Jiang, Chengzhang Peng, Zhiqiang Liu

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

1 citation (Scopus)

Abstract

To maximize write throughput, most deduplication systems and deduplication clusters store new chunks sequentially on disk. This method produces data fragments as the deduplication system grows. It is therefore important to analyse the data fragments in a deduplication system and to understand their features. We analyse the features of data fragments in a deduplication system using three real-world datasets. We use File Fragment Degree (FFD) to quantify the data fragments of a file in a deduplication system. We first implement Extreme Binning (EB) to collect the chunk addresses of every file in each dataset. Then, we design an FFD analyser to compute the FFD of every file from its chunk addresses and sizes. Finally, we analyse the FFD values. As far as we know, this is the first study to analyse data fragments in a deduplication system. Our findings show that: 1) there are a large number of data fragments in a deduplication system for various datasets; 2) for enterprise backup data, the amount of data fragments increases rapidly as the deduplication system grows; 3) for datasets mainly containing small files, the amount of data fragments increases slowly as the deduplication system grows.
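
The abstract does not give the formula for FFD, so the following is only an illustrative sketch of how an analyser might derive a fragmentation figure from a file's chunk addresses and sizes. It assumes each chunk is recorded as a hypothetical `(address, size)` pair in file order, and that a new fragment starts whenever a chunk does not begin where the previous one ended. The function names and the "fragments per chunk" ratio are assumptions made for illustration, not the paper's definition.

```python
from typing import List, Tuple

# Hypothetical record layout: (disk address in bytes, size in bytes),
# listed in the order the chunks appear in the file.
Chunk = Tuple[int, int]


def count_fragments(chunks: List[Chunk]) -> int:
    """Count the contiguous on-disk runs needed to read the file in order."""
    if not chunks:
        return 0
    fragments = 1
    for (prev_addr, prev_size), (addr, _size) in zip(chunks, chunks[1:]):
        # A gap or a backward jump between neighbours starts a new fragment.
        if addr != prev_addr + prev_size:
            fragments += 1
    return fragments


def fragment_degree(chunks: List[Chunk]) -> float:
    """Assumed stand-in for FFD: fragments per chunk, 0.0 for an empty file."""
    if not chunks:
        return 0.0
    return count_fragments(chunks) / len(chunks)


# Four chunks: the first two are adjacent, then a jump, then two adjacent.
example = [(0, 4096), (4096, 4096), (65536, 8192), (73728, 4096)]
print(count_fragments(example), fragment_degree(example))
```

Under these assumptions a file stored in one contiguous run scores `1 / n` for `n` chunks, and a file whose chunks are all scattered scores `1.0`.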

Original language: English
Title of host publication: Proceedings 2012 International Conference on System Science and Engineering, ICSSE 2012
Pages: 559-563
Number of pages: 5
Publication status: Published - 2012
Event: 2012 International Conference on System Science and Engineering, ICSSE 2012 - Dalian, Liaoning, China
Duration: 30 June 2012 to 2 July 2012

Publication series

Name: Proceedings 2012 International Conference on System Science and Engineering, ICSSE 2012

Conference

Conference: 2012 International Conference on System Science and Engineering, ICSSE 2012
Country/Territory: China
Dalian, Liaoning
Period: 30/06/12 to 2/07/12
