CIDR: A cost-effective in-line data reduction system for terabit-per-second scale SSD arrays

  • Mohammadamin Ajdari
  • , Pyeongsu Park
  • , Joonsung Kim
  • , Dongup Kwon
  • , Jangwoo Kim

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

An SSD array, a storage system consisting of multiple SSDs per node, has become a design choice to implement a fast primary storage system, and modern storage architects now aim to achieve terabit-per-second scale performance with the next-generation SSD array. To reduce the storage cost and improve the device endurability, such SSD array must employ data reduction schemes (i.e., deduplication, compression), which provide high data reduction capability at minimum costs. However, existing data reduction schemes do not scale with the fast increasing performance of an SSD array, due to inhibitive amount of CPU resources (e.g., in software-based schemes) or low data reduction ratio (e.g., in SSD device wide deduplication) or being cost ineffective to address workload changes in datacenters (e.g., in ASIC-based acceleration). In this paper, we propose CIDR, a novel FPGA-based, cost-effective data reduction system for an SSD array to achieve the terabit-per-second scale storage performance. Our key ideas are as follows. First, we decouple data reductionrelated computing tasks from the unscalable host CPUs by offloading them to a scalable array of FPGA boards. Second, we employ a centralized, node-wide metadata management scheme to achieve an SSD array-wide, high data reduction. Third, our FPGA-based reconfiguration adapts to different workload patterns by dynamically balancing the amount of software and hardware tasks running on CPUs and FPGAs, respectively. For evaluation, we built our example CIDR prototype achieving up to 12.8 GB/s (0.1 Tbps) on one FPGA. CIDR outperforms the baseline for a write-only workload by up to 2.47x and a mixed read-write workload by an expected 3.2x, respectively. We showed CIDR's scalability to achieve Tbps-scale performance by measuring a two-FPGA CIDR and projecting the performance impacts for more FPGAs.

Original languageEnglish
Title of host publicationProceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages28-41
Number of pages14
ISBN (Electronic)9781728114446
DOIs
StatePublished - 26 Mar 2019
Externally publishedYes
Event25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019 - Washington, United States
Duration: 16 Feb 201920 Feb 2019

Publication series

NameProceedings - 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019

Conference

Conference25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019
Country/TerritoryUnited States
CityWashington
Period16/02/1920/02/19

Keywords

  • Compression
  • Deduplication
  • FPGA
  • SSD array

Fingerprint

Dive into the research topics of 'CIDR: A cost-effective in-line data reduction system for terabit-per-second scale SSD arrays'. Together they form a unique fingerprint.

Cite this