Collaborative multi-dimensional dataset processing with distributed cache infrastructure in the cloud

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Modern large-scale systems are built from many small, independent servers, so it is increasingly important to orchestrate and exploit their distributed buffer cache memory seamlessly. Several previous studies have shown that, with large-scale distributed caching facilities, traditional resource scheduling policies often fail to deliver both a high cache hit ratio and good system load balance. A scheduling policy that considers only system load yields a low cache hit ratio, whereas a policy that favors cache hit ratio over load balance suffers from load imbalance. To maximize overall system throughput, distributed caching facilities should balance the workload and exploit cached data at the same time. In this work, we present a distributed job processing framework that yields a high cache hit ratio while achieving good system load balance, the two factors most critical to improving overall system throughput and job response time. Our framework is a component-based distributed data analysis framework that supports multiple, geographically distributed job schedulers. The job scheduler in our framework employs a distributed job scheduling policy, DEMA, that considers both cache hit ratio and system load. In this paper, we show that collaborative task scheduling can further improve performance by increasing the overall cache hit ratio while maintaining load balance. Our experiments show that the proposed job scheduling policies outperform a legacy load-based job scheduling policy in terms of job response time, load balancing, and cache hit ratio.
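
The abstract does not describe how DEMA works internally. As a rough illustration of a policy that weighs cache affinity and load together, the sketch below assumes an exponential-moving-average scheme: each server keeps a running "center" of the queries it has recently received, and a new query goes to the server whose center is nearest. The names `Server`, `schedule`, `ema_point`, and `alpha` are hypothetical and chosen for this example only; this is not the paper's implementation.

```python
import math


class Server:
    """Hypothetical back-end server as seen by the scheduler (assumption)."""

    def __init__(self, name, ema_point):
        self.name = name
        # Running summary of the query centers recently sent to this server.
        self.ema_point = list(ema_point)
        # Count of jobs assigned so far; a crude stand-in for load.
        self.assigned = 0


def schedule(servers, query_center, alpha=0.3):
    """Pick the server whose EMA point is nearest to the query, then update it.

    alpha is an assumed smoothing weight: larger values make a server's
    EMA point follow new queries more quickly.
    """
    target = min(servers, key=lambda s: math.dist(s.ema_point, query_center))
    target.ema_point = [
        alpha * q + (1 - alpha) * e
        for q, e in zip(query_center, target.ema_point)
    ]
    target.assigned += 1
    return target


# Two servers and two queries in a 2-D query space (illustrative values).
servers = [Server("s0", (0.0, 0.0)), Server("s1", (10.0, 10.0))]
first = schedule(servers, (1.0, 1.0))
second = schedule(servers, (9.0, 9.0))
```

Under these assumptions, queries over nearby regions tend to land on the same server, which favors cache reuse, while each server's center drifts toward frequently queried regions, which tends to spread a hot region across several servers.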

Original language: English
Title of host publication: Proceedings - 2014 International Conference on Cloud and Autonomic Computing, ICCAC 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 241-248
Number of pages: 8
ISBN (Electronic): 9781479958412
State: Published - 26 Jan 2015
Externally published: Yes
Event: 2014 International Conference on Cloud and Autonomic Computing, ICCAC 2014 - London, United Kingdom
Duration: 8 Sep 2014 to 12 Sep 2014

Publication series

Name: Proceedings - 2014 International Conference on Cloud and Autonomic Computing, ICCAC 2014

Conference

Conference: 2014 International Conference on Cloud and Autonomic Computing, ICCAC 2014
Country/Territory: United Kingdom
City: London
Period: 8/09/14 to 12/09/14
