Makespan reduction for dynamic workloads in cluster-based data grids using reinforcement-learning based scheduling

Authors: Mahshid Helali Moghadam, Seyed Morteza Babamir
Journal: Journal of Computational Science
Page number: 402
Volume number: 24
IF: 1.925
Paper Type: Full Paper
Published At: 2017-10-11
Journal Grade: Scientific - research
Journal Type: Electronic
Journal Country: Iran, Islamic Republic Of
Journal Index: ISI, SCOPUS

Abstract

Scheduling is one of the important problems in the control and management of grid and cloud-based systems. The data grid, still a primary solution for processing data-intensive tasks, manages large amounts of data distributed across multiple nodes. This paper proposes a two-phase learning-based scheduling algorithm for data-intensive tasks in cluster-based data grids. The algorithm applies a hierarchical multi-agent system, consisting of one global broker agent and several local agents, to the scheduling procedure. In the first phase, the global broker agent selects the cluster with the minimum data communication cost; in the second phase, the local agent of the selected cluster uses an adaptive policy based on Q-learning to assign the task to a suitable node of that cluster. The impact of three action selection strategies is investigated, and the resulting versions of the algorithm are evaluated under three types of workloads with heterogeneous tasks. Experimental results show that, for dynamic workloads with varying task submission patterns, the proposed learning-based algorithm performs better than four common scheduling algorithms, Queue Length (Shortest Queue), Access Cost, Queue Access Cost (QAC), and HCS, which use regular combinations of primary parameters such as data communication cost and queue length. Applying a learning-based strategy makes the scheduling algorithm more adaptable to changing conditions in the environment.
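
The two-phase idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the state representation, the reward signal, or the three action selection strategies, so the names `select_cluster` and `LocalAgent`, the opaque `state` label, the epsilon-greedy rule, the negative-completion-time reward, and all parameter values are assumptions made only for illustration.

```python
import random
from collections import defaultdict


def select_cluster(data_cost):
    """Phase 1 (global broker): return the cluster id with the minimum
    data communication cost. `data_cost` maps cluster id -> cost."""
    return min(data_cost, key=data_cost.get)


class LocalAgent:
    """Phase 2 (local agent): tabular Q-learning over the nodes of one cluster.

    Assumptions (not from the paper): states are opaque hashable labels,
    actions are node ids, and action selection is epsilon-greedy.
    """

    def __init__(self, nodes, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.nodes = list(nodes)
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.q = defaultdict(float)  # Q[(state, node)], defaults to 0.0
        self.rng = random.Random(seed)

    def choose_node(self, state):
        # Explore with probability epsilon, otherwise pick the greedy node.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.nodes)
        return max(self.nodes, key=lambda n: self.q[(state, n)])

    def update(self, state, node, reward, next_state):
        # Standard Q-learning update:
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        best_next = max(self.q[(next_state, n)] for n in self.nodes)
        td_target = reward + self.gamma * best_next
        self.q[(state, node)] += self.alpha * (td_target - self.q[(state, node)])


# Hypothetical usage: three clusters with made-up data costs.
costs = {"c1": 12.0, "c2": 7.5, "c3": 9.0}
chosen = select_cluster(costs)  # cluster with the lowest cost

agent = LocalAgent(["n1", "n2"], epsilon=0.0)
# Assumed reward: negative task completion time (shorter is better).
agent.update("s", "n2", reward=-3.0, next_state="s")
```

With exploration disabled (`epsilon=0.0`), the agent in this sketch avoids a node that has just produced a poor reward, which is the kind of adaptation to changing conditions that the abstract attributes to the learning-based strategy.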

Tags: Data grid, Data-intensive task scheduling algorithm, Data communication cost, Reinforcement learning