SCARL: Attentive reinforcement learning-based scheduling in a multi-resource heterogeneous cluster

  • Mukoe Cheong
  • Hyunsung Lee
  • Ikjun Yeom
  • Honguk Woo

Research output: Contribution to journal › Article › peer-review

33 Scopus citations

Abstract

Advanced reinforcement learning (RL) technologies have recently increased the opportunity for automating several tasks in cluster management at scale by exploiting repetitive logs of cluster operation and building a learning model for resource allocation and job scheduling. Yet, this trend of adopting RL in the domain of cluster management has not fully addressed the diversity and heterogeneity of jobs and machines in modern cluster environments. In this paper, we present an RL-based scheduler for a multi-resource cluster, namely SCARL (SCheduler with Attentive Reinforcement Learning), concentrating on intricate cluster operating conditions with different resource requirements and capabilities. Specifically, we employ attentive embedding and factored-action scheduling that together efficiently incorporate the time-varying interdependency of jobs and machines in RL processing; they enable an end-to-end scalable policy for scheduling diverse jobs on heterogeneous machines. To the best of our knowledge, we are the first to employ an attention mechanism in RL-based cluster resource management. Through experiments, we demonstrate that our approach is competitive with existing heuristic methods under various cluster simulation configurations, e.g., an average 9.2% improvement in slowdown over the shortest job first algorithm. Additionally, the approach yields stable performance on our test cluster when running synthetic workloads based on real traces.

Original language: English
Article number: 8876692
Pages (from-to): 153432-153444
Number of pages: 13
Journal: IEEE Access
Volume: 7
State: Published - 2019

Keywords

  • attention
  • attentive embedding
  • attentive reinforcement learning
  • cluster resource management
