Content-based chunk placement scheme for decentralized deduplication on distributed file systems

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

1 Scopus citation

Abstract

The rapid growth of data volumes causes problems such as storage limitations and rising data management costs. Distributed file systems (DFS) are widely used to store and manage massive data, and data deduplication schemes are being studied extensively to reduce storage volume. Deduplication increases the available storage capacity by eliminating duplicate data. However, the deduplication process incurs performance overhead, such as additional disk I/O. In this paper, we propose a content-based chunk placement scheme to increase the deduplication rate on a DFS. To avoid the performance overhead caused by the deduplication process, we use lessfs on each chunk server. With this design, the system performs deduplication in a decentralized manner on each chunk server. Moreover, we use consistent hashing for chunk allocation and failure recovery. Our experimental results show that the proposed system reduces storage space by 60% compared with the system without consistent hashing.
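
The placement idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: this record does not describe their hash function, ring layout, or server naming, so the use of SHA-1, the virtual-node count, and the names `HashRing`, `fingerprint`, and `cs1` to `cs3` are all assumptions made for illustration. The sketch only shows why hashing on chunk content sends identical chunks to the same chunk server (where a local deduplicating file system such as lessfs could then collapse them), and why removing a server relocates only the chunks that server held.

```python
import bisect
import hashlib


def fingerprint(chunk: bytes) -> str:
    """Content fingerprint of a chunk (SHA-1 is an assumption, not from the paper)."""
    return hashlib.sha1(chunk).hexdigest()


def _position(key: str) -> int:
    """Map a string key to a point on the ring (a 160-bit integer)."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring of chunk servers with virtual nodes."""

    def __init__(self, servers, vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server: str) -> None:
        # Each server owns several points so load spreads more evenly.
        for i in range(self.vnodes):
            point = _position(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server: str) -> None:
        # Dropping a server's points hands its chunks to the next points on the ring.
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def locate(self, chunk: bytes) -> str:
        # The chunk's position depends only on its content, so duplicate
        # chunks always resolve to the same server.
        point = int(fingerprint(chunk), 16)
        index = bisect.bisect_right(self._points, point) % len(self._points)
        return self._owner[self._points[index]]


ring = HashRing(["cs1", "cs2", "cs3"])
target = ring.locate(b"some chunk payload")
duplicate_target = ring.locate(b"some chunk payload")
```

Here `target` and `duplicate_target` are always equal, which is the property that lets each server deduplicate locally without a global index. Calling `ring.remove("cs2")` afterwards changes the location only of chunks that were previously placed on `cs2`.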

Original language: English
Title of host publication: Computational Science and Its Applications, ICCSA 2013 - 13th International Conference, Proceedings
Publisher: Springer Verlag
Pages: 173-183
Number of pages: 11
Edition: PART 1
ISBN (Print): 9783642396366
State: Published - 2013
Event: 13th International Conference on Computational Science and Its Applications, ICCSA 2013 - Ho Chi Minh City, Viet Nam
Duration: 24 Jun 2013 → 27 Jun 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 7971 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th International Conference on Computational Science and Its Applications, ICCSA 2013
Country/Territory: Viet Nam
City: Ho Chi Minh City
Period: 24/06/13 → 27/06/13

Keywords

  • Chunk placement
  • Consistent hashing
  • Deduplication
  • Distributed file system
