Advanced Data Management Capabilities Using MapReduce

Google Summer of Code 2009 ideas

Primary Mentor: Shantenu Jha sjha@cct.lsu.edu
Secondary Mentor: Andre Merzky andremerzky@gmail.com
OMII Project: OMII-SAGA

Background

Last year GSoC funded a project, Implementing MapReduce in SAGA. This project led to two peer-reviewed papers, one at Cloud 2009 (held in conjunction with CCGrid) and the other at WGCV (Workshop on Grids, Clouds and Virtualization), held in conjunction with GPC. Copies of these papers can be found at:

Project Goals

We would like to enhance SAGA-based MapReduce to make it more effective for real-world scientific problems that involve large data-sets and more complex data dependencies.

Project Description

MapReduce implemented using SAGA has been shown to handle large data-sets spread over many distributed machines with different architectures. However, the Reduce stage of this implementation must wait for the Map stage to finish before it can begin. Both Google's MapReduce and Hadoop provide a combiner function which, for commutative and associative reduce functions, executes on each machine that performs a Map task. Typically the combiner is the same function as the reduce function, but it runs during the Map stage. This partial combining of data before the Reduce stage cuts the amount of intermediate data that has to be transferred and speeds up many classes of MapReduce operations.
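The effect of a combiner can be illustrated with a toy, in-process word count. This is only a sketch in plain Python, not the SAGA MapReduce API: the names map_words, sum_counts and run_job are hypothetical, and each input chunk simply stands in for one machine running a Map task.

```python
from collections import defaultdict


def map_words(chunk):
    """Map step: emit a (word, 1) pair for every word in one input chunk."""
    return [(word, 1) for word in chunk.split()]


def sum_counts(pairs):
    """Reduce step: sum the values per key.

    Addition is commutative and associative, so the same function can
    also be used as the combiner on the map side.
    """
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


def run_job(chunks, use_combiner):
    """Run a toy job; return (result, number of pairs sent to the reducer)."""
    shuffled = []
    for chunk in chunks:  # each chunk stands in for one map worker (assumption)
        pairs = map_words(chunk)
        if use_combiner:
            pairs = sum_counts(pairs)  # partial reduce before any data moves
        shuffled.extend(pairs)
    return sum_counts(shuffled), len(shuffled)


chunks = ["a b a a", "b b c"]
plain, plain_shuffled = run_job(chunks, use_combiner=False)
combined, combined_shuffled = run_job(chunks, use_combiner=True)
```

Both runs produce the same counts, but with the combiner enabled fewer intermediate pairs reach the reduce step, which is where the saving comes from when map and reduce workers sit on different machines.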

Also, MapReduce implemented with SAGA does not take data locality into account: if part of the input data already resides on a machine that is available to run Map tasks, this is not used when work is assigned. We would like to use ping times and data locality to implement a more network- and locality-aware framework. Finally, we would like to extend MapReduce with better debugging facilities and more verbose output.
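One possible shape for such a locality-aware assignment is sketched below. It is an illustration under stated assumptions rather than a design for the SAGA implementation: assign_chunks, chunk_hosts and ping_ms are hypothetical names, ping time is used as a crude proxy for transfer cost, and ties are broken by the number of chunks a worker has already been given.

```python
def assign_chunks(chunk_hosts, workers, ping_ms):
    """Assign each chunk to the worker that is cheapest to feed.

    chunk_hosts: dict mapping chunk name -> host that stores the chunk
    workers:     list of hosts that can run Map tasks
    ping_ms:     dict mapping (worker, host) -> measured ping time in ms
    All three structures are assumptions made for this sketch.
    """
    assignment = {}
    load = {worker: 0 for worker in workers}

    for chunk, host in sorted(chunk_hosts.items()):

        def cost(worker):
            # A worker that already holds the chunk pays no transfer cost;
            # otherwise the ping time to the storing host is the estimate.
            transfer = 0.0 if worker == host else ping_ms[(worker, host)]
            # Break ties by current load, then by name for determinism.
            return (transfer, load[worker], worker)

        best = min(workers, key=cost)
        assignment[chunk] = best
        load[best] += 1

    return assignment


workers = ["w1", "w2"]
chunk_hosts = {"c1": "w1", "c2": "w2", "c3": "store"}
ping_ms = {
    ("w1", "w2"): 20.0,
    ("w2", "w1"): 20.0,
    ("w1", "store"): 5.0,
    ("w2", "store"): 50.0,
}
assignment = assign_chunks(chunk_hosts, workers, ping_ms)
```

In this example the chunks held by w1 and w2 stay where they are, and the chunk on the separate storage host goes to w1 because it has the lower ping time to that host. A real scheduler would also need to weigh chunk size, bandwidth and worker speed, which this sketch ignores.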

Project Requirements

  • A beginner's understanding of SAGA and MapReduce.
  • A LOT of enthusiasm and ability to imbibe caffeine as well as a burning desire to invest a summer soaking up radiation from a computer screen.

This page (revision-1) was last changed on 13-Mar-2009 13:26 by MarioAntonioletti

© The University of Southampton on behalf of OMII-UK. All Rights Reserved.