Enhancing Map-Reduce Job Execution on Geo-Distributed Data Across Datacenters

Publisher: Guru Nanak Publications
ISSN: 2278-0947
Series: Volume 4 Issue 2
Authors: P. Jebilla and P. Jayashree


Abstract

The size of the data sets that must be handled has grown rapidly in recent years, and analyzing Big Data efficiently remains a challenge. Cloud computing, together with implementations of the MapReduce framework, provides a parallel processing model for handling huge amounts of data. In many cloud scenarios the input data sets are geographically distributed across datacenters. This paper deals with enhancing MapReduce job execution on such geo-distributed data. The possible execution paths are analyzed, and a Data Transformation Graph is used to determine schedules for the job sequences, which are optimized using a shortest-path algorithm. The proposed model extends the existing Dijkstra's algorithm to consider node weights in addition to edge weights. Oozie workflows and mapper-side joins are used to reduce execution time and cost. Comparisons show that the execution time of MapReduce jobs is improved.
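
The idea of extending Dijkstra's algorithm with node weights can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how node and edge weights are defined, so the sketch assumes that a node weight is the cost of processing at a node of the graph, that an edge weight is the cost of moving data between nodes, and that all weights are non-negative. The function name, the node names, and the cost values are hypothetical.

```python
import heapq


def shortest_path_with_node_weights(edges, node_cost, source, target):
    """Dijkstra's algorithm where a path costs the sum of its edge weights
    plus the weights of every node it visits (source and target included).

    edges:     dict mapping node -> list of (neighbor, edge_weight)
    node_cost: dict mapping node -> node_weight (missing nodes cost 0)
    Assumes all weights are non-negative.
    """
    dist = {source: node_cost.get(source, 0)}
    prev = {}
    heap = [(dist[source], source)]
    done = set()

    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        if u == target:
            break
        for v, w in edges.get(u, []):
            # Entering v costs the edge weight plus v's own node weight.
            candidate = d + w + node_cost.get(v, 0)
            if v not in dist or candidate < dist[v]:
                dist[v] = candidate
                prev[v] = u
                heapq.heappush(heap, (candidate, v))

    if target not in dist:
        return float("inf"), []

    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Hypothetical graph: going through "A" has the cheaper edges,
# but "A" itself is expensive, so the route through "B" wins.
edges = {
    "start": [("A", 1), ("B", 2)],
    "A": [("end", 1)],
    "B": [("end", 1)],
}
node_cost = {"start": 0, "A": 10, "B": 2, "end": 0}

cost, path = shortest_path_with_node_weights(edges, node_cost, "start", "end")
print(cost, path)  # 5 ['start', 'B', 'end']
```

With edge weights alone the route through A would be chosen (2 versus 3); adding node weights changes the result to the route through B (5 versus 12), which is the kind of difference the extended algorithm is meant to capture.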

Keywords

cloud computing, data center, MapReduce, geo-distribution
