Adaptive weight compression for memory-efficient neural networks

  • Jong Hwan Ko
  • Duckhwan Kim
  • Taesik Na
  • Jaeha Kung
  • Saibal Mukhopadhyay

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Neural networks generally require significant memory capacity and bandwidth to store and access a large number of synaptic weights. This paper presents an application of JPEG image encoding to compress the weights by exploiting the spatial locality and smoothness of the weight matrix. To minimize the loss of accuracy due to JPEG encoding, we propose to adaptively control the quantization factor of the JPEG algorithm depending on the error sensitivity (gradient) of each weight: weight blocks with higher sensitivity are compressed less to preserve accuracy. The adaptive compression reduces the memory requirement, which in turn yields higher performance and lower energy in neural network hardware. Simulation of inference hardware for a multilayer perceptron on the MNIST dataset shows up to 42X compression with less than 1% loss of recognition accuracy, resulting in 3X higher effective memory bandwidth and ∼19X lower system energy.
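The core idea in the abstract — JPEG-style block-DCT compression of the weight matrix, with a quantization step chosen per block from the weights' gradient sensitivity — can be sketched as below. This is a minimal illustration, not the paper's implementation: the function names (`compress_weights`, `dct2`), the 8×8 blocking, and the specific adaptive-step formula (`base_q` scaled down for blocks with larger mean gradient magnitude) are assumptions for demonstration; the paper's full pipeline (entropy coding, hardware decoder, etc.) is omitted.

```python
import numpy as np

def _dct_matrix(N):
    # Orthonormal DCT-II matrix: C[k, n] = s_k * cos(pi*(2n+1)*k / (2N))
    n = np.arange(N)
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
    C[0, :] = np.sqrt(1.0 / N)
    return C

def dct2(block):
    # 2-D DCT via separable 1-D transforms
    C = _dct_matrix(block.shape[0])
    return C @ block @ C.T

def idct2(coeffs):
    # Inverse 2-D DCT (orthonormal, so inverse = transpose)
    C = _dct_matrix(coeffs.shape[0])
    return C.T @ coeffs @ C

def compress_weights(W, G, base_q=0.05, block=8):
    """Quantize W block-by-block in the DCT domain (JPEG-style).

    G holds per-weight gradients (error sensitivity). Blocks whose mean
    |gradient| is large get a finer quantization step, i.e. are
    compressed less, mirroring the paper's adaptive idea (formula here
    is a hypothetical stand-in)."""
    Wc = np.zeros_like(W)
    g_ref = np.abs(G).mean() + 1e-12  # global sensitivity reference
    for i in range(0, W.shape[0], block):
        for j in range(0, W.shape[1], block):
            w = W[i:i+block, j:j+block]
            g = np.abs(G[i:i+block, j:j+block]).mean()
            # sensitive blocks -> smaller step -> less quantization error
            step = base_q / (1.0 + g / g_ref)
            coeffs = dct2(w)
            q = np.round(coeffs / step) * step  # uniform quantization
            Wc[i:i+block, j:j+block] = idct2(q)
    return Wc
```

In a real codec the quantized coefficients would then be entropy-coded for storage; here the round-trip through `idct2` just shows the accuracy impact of the adaptive quantization step.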

Original language: English
Title of host publication: Proceedings of the 2017 Design, Automation and Test in Europe, DATE 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 199-204
Number of pages: 6
ISBN (Electronic): 9783981537093
DOIs
State: Published - 11 May 2017
Externally published: Yes
Event: 20th Design, Automation and Test in Europe, DATE 2017 - Swisstech, Lausanne, Switzerland
Duration: 27 Mar 2017 - 31 Mar 2017

Publication series

Name: Proceedings of the 2017 Design, Automation and Test in Europe, DATE 2017

Conference

Conference: 20th Design, Automation and Test in Europe, DATE 2017
Country/Territory: Switzerland
City: Swisstech, Lausanne
Period: 27/03/17 - 31/03/17

Keywords

  • Compression
  • JPEG
  • Memory-efficient
  • MLP
  • Neural network
  • Weight

