Cardiology will pump the deep web

by Simon Hettrick, OMII-UK

The Deep Web is the mass of data, making up about three quarters of the Web, that is inaccessible to most users. This hidden data is a critical resource to consider as we embark on the creation of the Semantic Web. The SADI (Semantic Automated Discovery and Integration) project is working to expose Deep Web data, of any kind, as if it were a Semantic Web resource. One early success has been the CardioSHARE project, which has exposed data for use by cardiovascular health researchers.

One can imagine the connections between every conceivable input on the Web and the outputs of every conceivable Web Service as a virtual graph. This graph could be used to discover information by tracing back from the desired output of a Web Service to the input it requires. However, the virtual graph changes constantly as the underlying analytical tools and data resources change. Rather than attempting to keep up with these changes, and falling prey to the well-known problems suffered by data warehouses, SADI dynamically queries the virtual graph as if it existed, but without instantiating it permanently. It does this by constructing only those segments of the virtual graph needed to answer a given question at a given time, discovering and invoking the appropriate Web Services as it goes.
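
The idea of building only the needed segments of the graph can be illustrated with a minimal Python sketch. It is not SADI's implementation: the registry, the two stand-in services, the predicate names (`encodes`, `participatesIn`) and the sample data are all hypothetical, and real services would be discovered through their RDF/OWL descriptions and called over the network.

```python
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical stand-ins for remote Web Services. Each takes a subject
# and returns the objects it can attach to that subject.
def gene_to_protein(gene: str) -> List[str]:
    return {"geneA": ["proteinA"]}.get(gene, [])

def protein_to_pathway(protein: str) -> List[str]:
    return {"proteinA": ["pathwayX", "pathwayY"]}.get(protein, [])

# Hypothetical registry: the predicate a service provides -> the service.
REGISTRY: Dict[str, Callable[[str], List[str]]] = {
    "encodes": gene_to_protein,
    "participatesIn": protein_to_pathway,
}

def resolve(start: str, path: List[str]) -> Tuple[List[str], Set[Triple]]:
    """Follow a chain of predicates from `start`, one service call per hop.

    Returns the nodes reached at the end of the chain and the triples that
    were materialised along the way; nothing else in the graph is built.
    """
    frontier = [start]
    triples: Set[Triple] = set()
    for predicate in path:
        service = REGISTRY[predicate]      # "discovery" by provided predicate
        next_frontier: List[str] = []
        for subject in frontier:
            for obj in service(subject):   # "invocation" of the service
                triples.add((subject, predicate, obj))
                next_frontier.append(obj)
        frontier = next_frontier
    return frontier, triples

results, graph = resolve("geneA", ["encodes", "participatesIn"])
print(results)
print(sorted(graph))
```

Only the three triples touched by the question are ever created, and they can be discarded afterwards, so there is no stored copy to fall out of date when a service or its data changes.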

SADI is a Web Service framework that uses Semantic Web technologies (RDF/OWL) to discover and invoke Web Services. In this way, the output from a Web Service can be dynamically exposed on the Semantic Web when it is needed. SADI provides a prototype, standards-based query interface that explores these dynamically exposed Semantic Web resources, making them appear to be traditional Semantic Web data stores.

CardioSHARE (Cardiovascular Semantic Health and Research Environment) is based at the iCAPTURE Centre for Cardiovascular and Pulmonary Research in Vancouver. It is an application of SADI aimed at health researchers, which allows complex queries to be simplified into straightforward named references to commonly understood subjects, data types, biological relationships, or biological properties. CardioSHARE makes it easy to compose complex queries by hiding the complexity of the query, and of any analytical steps, from the user.

For example, a doctor may find from his studies of heart disease that a certain type of patient is more responsive to the drug Warfarin. He publishes this information in a manuscript, and simultaneously puts an OWL definition of this type of patient on the Web as MyName:WarfarinHyperResponders. Another doctor, on seeing the publication, could then immediately query her database for patients of type MyName:WarfarinHyperResponders. This would allow her to check whether any of the patients in her pending drug trials fall into this category, so that she could accommodate this new knowledge and amend her drug-trial protocol.
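
The scenario above can be sketched in Python, purely as an illustration. A real OWL definition would be a class expression evaluated by a reasoner; here it is approximated by a named predicate over patient records. The `Patient` type, the `PUBLISHED_CLASSES` table, the attribute names (`marker_a`, `on_warfarin`) and the membership criteria are all hypothetical placeholders, not clinical criteria.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Patient:
    patient_id: str
    attributes: Dict[str, object]

# A published class definition, modelled here as a predicate over a patient.
ClassDefinition = Callable[[Patient], bool]

# Hypothetical shared definitions, keyed by their published name.
# The criteria are placeholders for illustration only.
PUBLISHED_CLASSES: Dict[str, ClassDefinition] = {
    "MyName:WarfarinHyperResponders": lambda p: (
        p.attributes.get("marker_a") is True
        and p.attributes.get("on_warfarin") is True
    ),
}

def patients_of_type(class_name: str, patients: List[Patient]) -> List[str]:
    """Return the IDs of patients who satisfy the named class definition."""
    definition = PUBLISHED_CLASSES[class_name]
    return [p.patient_id for p in patients if definition(p)]

# The second doctor's (made-up) trial database.
trial_patients = [
    Patient("p1", {"marker_a": True, "on_warfarin": True}),
    Patient("p2", {"marker_a": False, "on_warfarin": True}),
    Patient("p3", {"on_warfarin": False}),
]

matches = patients_of_type("MyName:WarfarinHyperResponders", trial_patients)
print(matches)
```

The second doctor refers to the class by name only; the criteria behind it stay with the author who published them and can be revised without her changing the query.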

The structured approach to sharing expert knowledge made possible by SADI will forever change the way knowledge discovery is achieved. Simplifying complex query and analysis tasks encourages more frequent and deeper exploration of existing data. This will, no doubt, result in a large number of fortuitous discoveries as knowledge that is buried deep in the Web is exposed.

www.tinyurl.com/ny7qqy

This page (revision-3) was last changed on 01-Jun-2009 14:07 by SimonHettrick

© The University of Southampton on behalf of OMII-UK. All Rights Reserved.