A content aware chunking scheme for data de-duplication in archival storage systems

Source: High Technology Letters (English Edition)
Based on variable-sized chunking, this paper proposes a content-aware chunking scheme, called CAC, that does not assume fully random file contents but instead considers the characteristics of the file types. CAC uses a candidate anchor histogram and file-type-specific knowledge to refine how anchors are determined when deduplicating file data, and it enforces the selected average chunk size. CAC finds more chunks, which in turn yields a smaller average chunk size and better data reduction. We present a detailed evaluation of CAC. The experimental results show that the scheme improves the compression ratio of chunking for file types whose bytes are not randomly distributed (from 11.3% to 16.7%, depending on the dataset) and improves write throughput by 9.7% on average.
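For readers unfamiliar with the variable-sized chunking that CAC builds on, the following is a minimal sketch of generic content-defined chunking with fingerprint-based deduplication. It is not the CAC algorithm: the candidate anchor histogram and the file-type-specific refinement are not reproduced here, because the abstract does not specify them. The gear-style rolling hash, the function names (`split_chunks`, `dedup_stats`), and the parameters (`avg_bits`, `min_size`, `max_size`) are all assumptions chosen for illustration.

```python
import hashlib
import random

# Hypothetical lookup table for a gear-style rolling hash (one random
# 32-bit value per byte value). The fixed seed keeps runs repeatable.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def split_chunks(data: bytes, avg_bits: int = 6,
                 min_size: int = 16, max_size: int = 256) -> list:
    """Split data at content-defined anchors (illustrative sketch).

    A position becomes an anchor when the top `avg_bits` bits of the
    rolling hash are all zero, so anchors are expected roughly every
    2**avg_bits bytes on random input. min_size and max_size bound the
    chunk length, which is one simple way to keep the average chunk
    size near a chosen target.
    """
    shift = 32 - avg_bits
    chunks = []
    start = 0
    h = 0
    for i, b in enumerate(data):
        # Shifting left by one and truncating to 32 bits means the hash
        # depends only on the most recent 32 bytes.
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        size = i + 1 - start
        if size < min_size:
            continue  # too short to cut here
        if (h >> shift) == 0 or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def dedup_stats(chunks: list) -> tuple:
    """Return (total_chunks, unique_chunks, stored_bytes) when each
    distinct chunk, identified by its SHA-256 digest, is stored once."""
    store = {}
    for c in chunks:
        store.setdefault(hashlib.sha256(c).digest(), len(c))
    return len(chunks), len(store), sum(store.values())
```

Calling `dedup_stats(split_chunks(data))` on a byte string gives the number of chunks, the number of distinct chunks, and the bytes that would remain after deduplication. Because anchors in this sketch depend only on local content, an insertion near the start of a file tends to shift only nearby chunk boundaries; a scheme such as CAC would additionally adjust how anchors are selected for non-random file types.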