OGSA-DAI based data integration projects at AIST
A wide range of data is currently published through Web-accessible search interfaces, including bioinformatics databases such as UniProt and publication databases such as CiteSeer and PubMed. OGSA-WebDB is an extension of OGSA-DAI that allows users to integrate data from multiple Web databases.
OGSA-WebDB is developed at the National Institute of Advanced Industrial Science and Technology (AIST) in Japan. It is based on the mediator/wrapper approach to data integration: a wrapper component is developed for each Web database, and a mediator component uses the wrappers to obtain data, which it then combines to answer queries. OGSA-WebDB supports the SQL query language, so users can write declarative queries that reference different remote Web databases as if they were local tables. To achieve this, each wrapper maps the data published by a Web database to a relational schema, and the mediator merges the local schemas into a global schema which, in turn, is used to construct queries. Various optimisation strategies are employed so that queries are answered as quickly as possible, which is essential when joining data from multiple, possibly very large, remote databases.
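The mediator/wrapper idea can be illustrated with a minimal Python sketch. This is not the OGSA-WebDB API: the `StubWrapper` and `Mediator` classes, the table and column names, and the rows are all hypothetical placeholders, and an in-memory SQLite database stands in for the mediator's query engine. Each wrapper exposes one source as a relational table, and the mediator loads those tables side by side so that a single SQL query can join them.

```python
import sqlite3


class StubWrapper:
    """Hypothetical wrapper: exposes one Web database as a relational table.

    A real wrapper would call the site's search interface and parse the
    results; this stub returns placeholder rows so the sketch is runnable.
    """

    def __init__(self, table, columns, rows):
        self.table = table
        self.columns = columns
        self._rows = rows

    def fetch(self):
        # Return the source's data as tuples matching self.columns.
        return list(self._rows)


class Mediator:
    """Hypothetical mediator: combines wrapper tables and answers SQL queries."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def query(self, sql):
        conn = sqlite3.connect(":memory:")
        try:
            # Build the "global schema": one table per wrapper.
            for w in self.wrappers:
                cols = ", ".join(w.columns)
                marks = ", ".join("?" for _ in w.columns)
                conn.execute(f"CREATE TABLE {w.table} ({cols})")
                conn.executemany(
                    f"INSERT INTO {w.table} VALUES ({marks})", w.fetch()
                )
            return conn.execute(sql).fetchall()
        finally:
            conn.close()


# Placeholder data only; not real UniProt or PubMed records.
proteins = StubWrapper(
    "proteins",
    ("accession", "name", "pubmed_id"),
    [("ACC1", "protein A", 101), ("ACC2", "protein B", 102)],
)
papers = StubWrapper(
    "papers",
    ("pubmed_id", "title"),
    [(101, "Paper about A"), (103, "Unrelated paper")],
)

mediator = Mediator([proteins, papers])
result = mediator.query(
    "SELECT p.name, q.title "
    "FROM proteins p JOIN papers q ON p.pubmed_id = q.pubmed_id"
)
print(result)
```

The sketch fetches every row from every wrapper before evaluating the query, which would be impractical against large remote sources. A production system would presumably pass query conditions down to the sources and order joins carefully, which is the role of the optimisation strategies mentioned above.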
The OGSA-WebDB project is one part of an effort at AIST to investigate ways of integrating heterogeneous data using OGSA-DAI. OGSA-DAI-RDF is a set of activities that allow RDF stores, such as Jena and Sesame, to be queried using SPARQL. While OGSA-WebDB and OGSA-DAI-RDF provide data access, it is also highly desirable to achieve declarative data integration by extending OGSA-DAI's distributed query processor component, OGSA-DQP. Extensions to OGSA-DQP have been developed to support XML manipulation and OGSA-WebDB, and support for OGSA-DAI-RDF may follow in the future.
Steven Lynden, AIST.

© The University of Southampton on behalf of OMII-UK. All Rights Reserved.