GCStack: A GPU Cycle Accounting Mechanism for Providing Accurate Insight Into GPU Performance

  • Hanna Cha
  • Sungchul Lee
  • Yeonan Ha
  • Hanhwi Jang
  • Joonsung Kim
  • Youngsok Kim

Research output: Contribution to journal › Article › peer-review

Abstract

Cycles Per Instruction (CPI) stacks help computer architects gain insight into the performance of their target architectures and applications. To bring the benefits of CPI stacks to Graphics Processing Units (GPUs), prior studies have proposed GPU cycle accounting mechanisms that can identify the stall cycles and their stall events on GPU architectures. Unfortunately, the prior studies cannot provide accurate insight into the GPU performance due to their coarse-grained, priority-driven, and issue-centric cycle accounting mechanisms. In this letter, we present GCStack, a fine-grained GPU cycle accounting mechanism that constructs accurate CPI stacks and accurately identifies primary GPU performance bottlenecks. GCStack first exposes all the stall events of the outstanding warps of a warp scheduler, most of which get hidden by the existing mechanisms. Then, GCStack defers the classification of structural stalls, which the existing mechanisms cannot correctly identify with their issue-stage-centric stall classification, to the later stages of the GPU pipeline. We implement GCStack on Accel-Sim and show that GCStack provides more accurate CPI stacks and GPU performance insight than GSI, the state-of-the-art GPU cycle accounting mechanism whose primary focus is on characterizing memory-related stalls.
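The contrast the abstract draws between priority-driven accounting and exposing the stall events of all outstanding warps can be illustrated with a toy model. The sketch below is not the GCStack or GSI implementation. The stall labels, the priority order, the function names, and the even split of each stall cycle across warps are all assumptions made for illustration, and the deferred classification of structural stalls is not modeled.

```python
from collections import Counter

# Hypothetical stall labels and priority order, chosen only for illustration.
PRIORITY = ["memory", "sync", "structural", "data_dep", "idle"]


def priority_driven(cycles):
    """Charge each stall cycle to the single highest-priority stall event."""
    stack = Counter()
    for warp_stalls in cycles:
        # The earliest entry in PRIORITY wins and hides every other event.
        winner = min(warp_stalls, key=PRIORITY.index)
        stack[winner] += 1.0
    return dict(stack)


def fine_grained(cycles):
    """Split each stall cycle evenly across every outstanding warp's event.

    The even 1/N split is an assumption of this sketch.
    """
    stack = Counter()
    for warp_stalls in cycles:
        share = 1.0 / len(warp_stalls)
        for stall in warp_stalls:
            stack[stall] += share
    return dict(stack)


# Each inner list holds the stall event of every outstanding warp of one
# warp scheduler during a single cycle in which nothing was issued.
trace = [
    ["memory", "data_dep", "data_dep", "data_dep"],
    ["memory", "data_dep", "data_dep", "data_dep"],
]

coarse = priority_driven(trace)  # {'memory': 2.0}
fine = fine_grained(trace)       # {'memory': 0.5, 'data_dep': 1.5}
```

In this toy trace the priority-driven stack reports only memory stalls, while the per-warp stack shows that three of the four warps were waiting on data dependencies. Both stacks account for the same two stall cycles in total.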

Original language: English
Pages (from-to): 235-238
Number of pages: 4
Journal: IEEE Computer Architecture Letters
Volume: 23
Issue number: 2
State: Published - 2024

Keywords

  • CPI stack
  • cycle accounting
  • GPU
