Advanced Data Management Capabilities Using MapReduce
Google Summer of Code 2009 ideas
Primary Mentor: Shantenu Jha sjha@cct.lsu.edu
Secondary Mentor: Andre Merzky andremerzky@gmail.com
OMII Project: OMII-SAGA
Background
Last year GSoC funded a project, Implementing MapReduce in SAGA. This
project led to two peer-reviewed papers, one at Cloud 2009 (held in
conjunction with CCGrid) and the other at WGCV (Workshop on Grids, Clouds
and Virtualization), held in conjunction with GPC. Copies of these papers
can be found at:
- http://www.cct.lsu.edu/~sjha/publications/saga_cloud_interop.pdf
- http://www.cct.lsu.edu/~sjha/publications/saga_data_intensive.pdf
Project Goals
We would like to enhance SAGA-based MapReduce to make it more effective for real-world scientific problems that involve large data sets and more complex data dependencies.
Project Description
MapReduce implemented using SAGA has been shown to handle large data sets across many distributed machines with different architectures. However, the Reduce stage of this implementation must wait for the Map stage to finish before it can begin. Both Google's MapReduce and Hadoop provide a combiner function, which, for commutative and associative reduce functions, executes on each machine that performs a Map task. Typically the combiner is the same function as the reduce function, but it runs during the Map stage. This partial combining of data before the Reduce stage speeds up many classes of MapReduce operations, because less intermediate data has to be transferred to the reducers.
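To illustrate the combiner idea, here is a minimal single-process sketch in Python using word counting, where the reduce function (addition) is commutative and associative. It is not the SAGA MapReduce API: the names map_words, combine, reduce_counts and run are hypothetical and exist only for this example.

```python
from collections import Counter, defaultdict


def map_words(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word, 1) for word in chunk.split()]


def combine(pairs):
    """Combiner: pre-sum the counts on the worker that ran the map task."""
    partial = Counter()
    for word, count in pairs:
        partial[word] += count
    return list(partial.items())


def reduce_counts(pairs):
    """Reduce: sum all counts per word across every map task."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def run(chunks, use_combiner=True):
    """Run map (and optionally combine) per chunk, then reduce.

    Returns the final counts and the number of intermediate pairs
    that would have to be sent to the reducers.
    """
    intermediate = []
    for chunk in chunks:
        pairs = map_words(chunk)
        if use_combiner:
            pairs = combine(pairs)
        intermediate.extend(pairs)
    return reduce_counts(intermediate), len(intermediate)
```

Calling run(chunks, use_combiner=True) and run(chunks, use_combiner=False) on the same input gives the same final counts, but the combined run hands fewer intermediate pairs to the Reduce stage whenever a word repeats within a chunk.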
Also, MapReduce implemented with SAGA does not take data locality into consideration: when input data already resides on a machine that will run a Map task, that fact is not used when assigning work. We would like to use ping times and data locality to build a more locality-aware framework. Finally, we would like to extend MapReduce with more debugging facilities and more verbose output.
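One possible shape for a locality-aware assignment is sketched below in Python. It is an assumption made for illustration, not the project's design or any SAGA API: the function pick_worker, its parameters and the host names in the usage note are all hypothetical. The sketch prefers a worker that already holds the data and otherwise falls back to the worker with the lowest measured ping time to a host that does.

```python
def pick_worker(chunk_hosts, idle_workers, ping_ms):
    """Choose a worker for one input chunk.

    chunk_hosts:  set of hosts that hold a copy of the chunk
    idle_workers: list of worker hosts available for a map task
    ping_ms:      dict mapping (worker, data_host) to round-trip time in ms
    """
    if not idle_workers:
        raise ValueError("no idle workers available")

    # First choice: a worker that already stores the chunk locally.
    for worker in idle_workers:
        if worker in chunk_hosts:
            return worker

    # Fallback: the worker closest (by ping time) to any host with the chunk.
    # Unmeasured pairs are treated as infinitely far away.
    def best_ping(worker):
        return min(
            (ping_ms.get((worker, host), float("inf")) for host in chunk_hosts),
            default=float("inf"),
        )

    return min(idle_workers, key=best_ping)
```

For example, with idle workers n1 and n2 and a chunk stored only on n3, pick_worker({"n3"}, ["n1", "n2"], {("n1", "n3"): 40.0, ("n2", "n3"): 5.0}) selects n2, the worker with the shorter ping time to the data.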
Project Requirements
- A beginner's understanding of SAGA and MapReduce.
- A LOT of enthusiasm and ability to imbibe caffeine as well as a burning desire to invest a summer soaking up radiation from a computer screen.





© The University of Southampton on behalf of OMII-UK. All Rights Reserved.