Column-wise Quantization of Weights and Partial Sums for Accurate and Efficient Compute-In-Memory Accelerators

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

1 Scopus citations

Abstract

Compute-in-memory (CIM) is an efficient method for implementing deep neural networks (DNNs) but suffers from substantial overhead from analog-to-digital converters (ADCs), especially as ADC precision increases. Low-precision ADCs can reduce this overhead but introduce partial-sum quantization errors degrading accuracy. Additionally, low-bit weight constraints, imposed by cell limitations and the need for multiple cells for higher-bit weights, present further challenges. While fine-grained partial-sum quantization has been studied to lower ADC resolution effectively, weight granularity, which limits overall partial-sum quantized accuracy, remains underexplored. This work addresses these challenges by aligning weight and partial-sum quantization granularities at the column-wise level. Our method improves accuracy while maintaining dequantization overhead, simplifies training by removing two-stage processes, and ensures robustness to memory cell variations via independent column-wise scale factors. We also propose an open-source CIM-oriented convolution framework to handle fine-grained weights and partial-sums efficiently, incorporating a novel tiling method and group convolution. Experimental results on ResNet-20 (CIFAR-10, CIFAR-100) and ResNet-18 (ImageNet) show accuracy improvements of 0.99%, 2.69%, and 1.01%, respectively, compared to the best-performing related works. Additionally, variation analysis reveals the robustness of our method against memory cell variations. These findings highlight the effectiveness of our quantization scheme in enhancing accuracy and robustness while maintaining hardware efficiency in CIM-based DNN implementations. Our code is available at https://github.com/jiyoonkm/ColumnQuant.

Original languageEnglish
Title of host publication2025 Design, Automation and Test in Europe Conference, DATE 2025 - Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)9783982674100
DOIs
StatePublished - 2025
Event2025 Design, Automation and Test in Europe Conference, DATE 2025 - Lyon, France
Duration: 31 Mar 20252 Apr 2025

Publication series

NameProceedings -Design, Automation and Test in Europe, DATE
ISSN (Print)1530-1591

Conference

Conference2025 Design, Automation and Test in Europe Conference, DATE 2025
Country/TerritoryFrance
CityLyon
Period31/03/252/04/25

Keywords

  • Compute-in-memory
  • convolutional neural networks
  • quantization

Fingerprint

Dive into the research topics of 'Column-wise Quantization of Weights and Partial Sums for Accurate and Efficient Compute-In-Memory Accelerators'. Together they form a unique fingerprint.

Cite this