GAROS: Genetic algorithm-aided row-skipping for shift and duplicate kernel mapping in processing-in-memory architectures

Research output: Contribution to journal › Article › peer-review

Abstract

Processing-in-memory (PIM) architecture is becoming a promising candidate for convolutional neural network (CNN) inference. A recent mapping method, shift and duplicate kernel (SDK), improves latency by increasing array utilization through shifting the same kernels into idle columns. Although pattern-based pruning effectively enables row-skipping, traditional pattern designs are suboptimal for SDK mapping because the irregular kernel shifts complicate row-skipping. To address this, we proposed pruning-aided row-skipping (PAIRS), which adopts SDK-optimized layer-wise patterns. However, PAIRS has two key limitations: it offers only discrete row-skipping because it uses a single pattern set, which restricts precise control over weight matrix compression for varying layer and array sizes, and it risks accuracy loss by pruning critical weights. To overcome these challenges, we introduce genetic algorithm-aided row-skipping (GAROS), which employs input channel (IC)-wise patterns. GAROS enables finer control over row-skipping by assigning several pattern sets and selecting the optimal pattern for each IC to preserve critical weights. Consequently, this approach enables continuous weight matrix compression while balancing the trade-off between row-skipping and accuracy. Simulation results on WRN16-4 demonstrate that GAROS improved accuracy by up to 2.4% compared to PAIRS and achieved up to a 1.74× speedup compared to the baseline when a 128 × 128 sub-array is used.
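As a rough illustration of why pattern-based pruning enables row-skipping, the sketch below counts the all-zero rows of a weight matrix after pruning. It is not the GAROS or PAIRS implementation: it assumes a plain im2col-style mapping (rows indexed by input channel and kernel position, columns by output channel) rather than SDK mapping, and the 3 × 3 kernel size, the toy layer shape, the two masks, and the helper names `build_weight_matrix` and `skippable_rows` are all hypothetical.

```python
import random

K = 3  # assumed kernel height/width (3x3)

def build_weight_matrix(weights, masks):
    """Flatten weights[oc][ic][pos] into an im2col-style matrix.

    Rows index (input channel, kernel position); columns index output channels.
    masks[ic] holds K*K 0/1 flags shared by every kernel of that input channel.
    """
    n_oc, n_ic = len(weights), len(weights[0])
    rows = []
    for ic in range(n_ic):
        for pos in range(K * K):
            rows.append([weights[oc][ic][pos] * masks[ic][pos] for oc in range(n_oc)])
    return rows

def skippable_rows(matrix):
    """Indices of rows whose weights are all zero, so the row need not be activated."""
    return [r for r, row in enumerate(matrix) if all(w == 0 for w in row)]

# Toy layer: 4 output channels, 2 input channels, strictly nonzero random weights.
rng = random.Random(0)
weights = [[[rng.uniform(0.1, 1.0) for _ in range(K * K)]
            for _ in range(2)]
           for _ in range(4)]

# Hypothetical IC-wise patterns: IC 0 keeps 4 of 9 positions, IC 1 keeps 6 of 9.
masks = [
    (0, 1, 0, 1, 1, 1, 0, 0, 0),
    (1, 1, 0, 1, 1, 0, 1, 1, 0),
]

matrix = build_weight_matrix(weights, masks)
skipped = skippable_rows(matrix)
print(f"{len(skipped)} of {len(matrix)} rows can be skipped")
```

Because each input channel carries its own mask in this toy setup, the number of skippable rows can be tuned one channel at a time, which loosely mirrors the finer-grained control the abstract attributes to IC-wise patterns.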

Original language: English
Article number: 103423
Journal: Journal of Systems Architecture
Volume: 165
State: Published - Aug 2025

Keywords

  • Convolutional neural network
  • Pattern-based pruning
  • Processing in memory
  • Weight mapping
