
Compiler-assisted GPU thread throttling for reduced cache contention

  • Hyunjun Kim
  • Sungin Hong
  • Hyeonsu Lee
  • Euiseong Seo
  • Hwansoo Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Modern GPUs concurrently deploy thousands of threads to maximize thread-level parallelism (TLP) for performance. For some applications, however, maximized TLP leads to significant performance degradation, as many concurrent threads compete for the limited capacity of the data cache. In this paper, we propose a compiler-assisted thread throttling scheme, which limits the number of active thread groups to reduce cache contention and consequently improve performance. A few dynamic thread throttling schemes have been proposed to alleviate cache contention by monitoring cache behavior, but they often fail to respond in time to dynamic changes in that behavior, because they adjust the parallelism only after the behavior has been observed. Our thread throttling scheme relies on compile-time adjustment of the number of active thread groups to fit their memory footprints to the L1D capacity. We evaluated the proposed scheme with GPU programs that suffer from cache contention. Our approach improved the performance of the original programs by 42.96% on average, an 8.97% performance boost in comparison to the static thread throttling schemes.
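
The idea of fitting the memory footprints of active thread groups to the L1D capacity can be illustrated with a small calculation. The following is a minimal sketch, not the paper's actual analysis: the function name `max_active_groups`, its parameters, and the example sizes are all hypothetical, and it assumes that every thread group has the same, statically known footprint.

```python
def max_active_groups(l1d_bytes, footprint_per_group_bytes, hw_max_groups):
    """Return how many thread groups may be active so that their combined
    footprint fits in the L1 data cache (hypothetical helper).

    Assumption: every thread group touches the same number of bytes, which
    a compiler would have to estimate statically.
    """
    if footprint_per_group_bytes <= 0:
        # No measurable footprint: nothing to throttle.
        return hw_max_groups
    # Number of whole footprints that fit in the cache.
    fit = l1d_bytes // footprint_per_group_bytes
    # Never exceed the hardware limit, and always keep one group running.
    return max(1, min(hw_max_groups, fit))


# Illustrative numbers only: a 16 KiB L1D, 3 KiB per thread group,
# and a hardware limit of 8 concurrently active groups.
limit = max_active_groups(16 * 1024, 3 * 1024, 8)
print(limit)
```

With these illustrative numbers, five footprints fit in the cache, so the sketch would throttle the eight hardware-permitted groups down to five. A real scheme would also need to account for details this sketch ignores, such as cache associativity and data shared between groups.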

Original language: English
Title of host publication: Proceedings of the 48th International Conference on Parallel Processing, ICPP 2019
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450362955
State: Published - 5 Aug 2019
Event: 48th International Conference on Parallel Processing, ICPP 2019 - Kyoto, Japan
Duration: 5 Aug 2019 - 8 Aug 2019

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 48th International Conference on Parallel Processing, ICPP 2019
Country/Territory: Japan
City: Kyoto
Period: 5/08/19 - 8/08/19

Keywords

  • Cache Contention
  • GPGPU
  • Static Analysis
  • Thread Throttling
