A better prognosis thanks to the sharing of genomic data
by Mark Delderfield, Microsoft Shared Genomics
The Shared Genomics project gives clinicians the ability to analyse large-scale, complex genetic data using Taverna and High Performance Computing. It calculates the statistical likelihood that a Single Nucleotide Polymorphism (SNP, a variation at a single position in a DNA sequence) is responsible for a particular disease, and annotates the results so that the clinician can make a better assessment and weed out false positives. Funding from the Engage project has led to a re-design of the Taverna workflow engine to handle multiple concurrent users.
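The article does not say which statistic the project computes. As an illustration only, one common textbook choice for case-control data is a Pearson chi-square test on a 2x2 table of allele counts. The sketch below assumes that test; the function name and the counts are hypothetical and are not taken from the Shared Genomics platform.

```python
import math


def allelic_chi_square(case_minor, case_major, control_minor, control_major):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value). Illustrative only: not the project's actual method.
    """
    table = [[case_minor, case_major], [control_minor, control_major]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column of the table must be non-empty")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in this cell if allele and disease status were independent
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: minor allele seen 30 times in 100 case alleles, 10 in 100 control alleles
chi2, p_value = allelic_chi_square(30, 70, 10, 90)
```

With these made-up counts the statistic is 12.5 and the p-value is about 0.0004. In a genome-wide scan many thousands of SNPs are tested, so some small p-values arise by chance alone, which is one reason annotation is needed to weed out false positives.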
A number of workflows for generating relevant annotations have already been developed using Taverna. Rather than replicating these workflows, the Shared Genomics project, based at the Northwest Institute for BioHealth Informatics, will re-use them by integrating the Taverna workflow engine into its platform. Any workflow written by bioinformaticians could then be deployed within the Shared Genomics workbench with minimal involvement from software engineers.
The version of Taverna originally used by the project, version 1.7, could run only one workflow at a time and spawned a new process for each call it received, which added an overhead of several seconds and used a large amount of memory. After securing support from the OMII-UK Engage programme, the myGrid team redesigned the Taverna workflow engine to run as a web service that handles multiple users concurrently.
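The difference between the two designs can be sketched in a few lines. The example below is not Taverna code: `WorkflowService` and `annotate_snp` are hypothetical names, and a Python thread pool stands in for the redesigned engine. It only illustrates the idea of one long-lived service sharing its workers across many callers instead of starting a fresh process for every call.

```python
from concurrent.futures import ThreadPoolExecutor


class WorkflowService:
    """Hypothetical long-lived service: one shared worker pool serves every caller."""

    def __init__(self, max_workers=4):
        # The pool is created once, so individual calls pay no start-up cost
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def submit(self, workflow, inputs):
        # Returns a Future immediately; the workflow runs on a pooled worker
        return self._pool.submit(workflow, inputs)

    def shutdown(self):
        # Wait for outstanding workflows, then release the workers
        self._pool.shutdown(wait=True)


def annotate_snp(inputs):
    # Placeholder for a real annotation workflow (assumption for illustration)
    return {"snp": inputs["snp"], "annotated": True}


service = WorkflowService(max_workers=4)
# Eight calls arrive together; the identifiers are made up for the example
futures = [service.submit(annotate_snp, {"snp": f"rs{i}"}) for i in range(8)]
results = [future.result() for future in futures]
service.shutdown()
```

Because the workers persist between calls, the cost of starting the engine is paid once rather than once per workflow, and memory use is bounded by the size of the pool rather than by the number of callers.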
The completed Engage work gives third-party applications, such as the Shared Genomics workbench, an underlying workflow-enactment service that scales to, and remains efficient under, a large volume of concurrent workflow calls. This opens up the possibility for researchers to use the rapidly expanding base of Taverna workflows available through the myExperiment project to retrieve up-to-date bioinformatics data. Shared Genomics provides clinical researchers with the additional information they need to make a prognosis about the cause of a particular disease trait.





© The University of Southampton on behalf of OMII-UK. All Rights Reserved.