Information Density Enhancement Using Lossy Compression in DNA Data Storage

Research output: Contribution to journal › Article › peer-review

Abstract

This study develops two lossy compression models for deoxyribonucleic acid (DNA) data storage, Models A and B, which encode grayscale images into DNA sequences to enhance information density while enabling high-fidelity image recovery. The models, distinguished by their handling of pixel domains and their interpolation methods, offer a novel approach to DNA data storage. Model A processes pixels in overlapped domains using linear interpolation (LI), whereas Model B uses non-overlapped domains with nearest-neighbor interpolation (NNI). In a comparative analysis with Joint Photographic Experts Group (JPEG) compression, the DNA lossy compression models demonstrate competitive advantages in information density and restored image quality. Applying the models to the Modified National Institute of Standards and Technology (MNIST) dataset shows their efficiency and the recognizability of the decompressed images, as validated by convolutional neural network (CNN) performance. In particular, Model B2, a version of Model B, emerges as an effective method for balancing high information density (more than 20 times the typical density of two bits per nucleotide) with reasonably good image quality. These findings highlight the potential of DNA-based data storage systems for high-density, efficient compression and indicate a promising future for biological data storage.
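To make the density argument concrete, the following is a minimal sketch in Python of how discarding pixels before encoding raises the number of image bits represented per nucleotide. It is not the paper's Model A or Model B. The base mapping (00, 01, 10, 11 to A, C, G, T), the block-averaging step, and all function names are assumptions chosen for illustration; only the two-bits-per-nucleotide baseline, the non-overlapped domains, and the nearest-neighbor recovery idea come from the abstract.

```python
# Hypothetical illustration only: NOT the paper's Model A or Model B.
# Assumes image height and width are divisible by the block size k.

BASES = "ACGT"  # assumed mapping: 00->A, 01->C, 10->G, 11->T


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides (2 bits per nucleotide)."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(seq: str) -> bytes:
    """Invert bytes_to_dna: every four nucleotides become one byte."""
    vals = [BASES.index(ch) for ch in seq]
    out = bytearray()
    for i in range(0, len(vals), 4):
        a, b, c, d = vals[i:i + 4]
        out.append((a << 6) | (b << 4) | (c << 2) | d)
    return bytes(out)


def downsample(img, k):
    """Average each non-overlapping k x k block into a single pixel."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(img[r + i][c + j] for i in range(k) for j in range(k)) // (k * k)
            for c in range(0, w, k)
        ]
        for r in range(0, h, k)
    ]


def upsample_nearest(small, k):
    """Nearest-neighbor interpolation: repeat each pixel over a k x k block."""
    return [
        [small[r // k][c // k] for c in range(len(small[0]) * k)]
        for r in range(len(small) * k)
    ]


def encode_image(img, k):
    """Lossy step (downsample) followed by the 2-bit-per-base encoding."""
    small = downsample(img, k)
    return bytes_to_dna(bytes(p for row in small for p in row))


def decode_image(seq, width, k):
    """Decode the sequence and restore the original size by interpolation."""
    payload = dna_to_bytes(seq)
    sw = width // k  # width of the downsampled image
    small = [list(payload[i:i + sw]) for i in range(0, len(payload), sw)]
    return upsample_nearest(small, k)


def information_density(height, width, seq, bits_per_pixel=8):
    """Original image bits represented per stored nucleotide."""
    return height * width * bits_per_pixel / len(seq)
```

Under this toy accounting, an 8-bit 28 × 28 image encoded with k = 1 stays at the baseline of 2 bits per nucleotide, while k = 4 stores 49 pixels in 196 nucleotides, so 6272 image bits are represented by 196 bases. Swapping `upsample_nearest` for a linear interpolator would trade blockiness for smoothing at the same density; how the published models actually partition domains and interpolate is not specified in the abstract.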

Original language: English
Article number: 2403071
Journal: Advanced Materials
Volume: 37
Issue number: 26
State: Published - 3 Jul 2025

Keywords

  • DNA data storage
  • MNIST classification
  • image quality assessment
  • information density
  • lossy compression
