Workflows for Graves' Disease Scenario using Taverna

Software used:
Taverna: the workflow environment of myGrid.
GRIMOIRES: a registry that hosts descriptions of services and workflows.

The aim of this scenario is to identify and characterise genes located in regions of human chromosomes that show linkage to Graves' disease (GD) (Fig. 1). GD is an autoimmune disease of the thyroid in which an individual's immune system attacks cells in the thyroid gland, resulting in hyperthyroidism. The hyperthyroidism is caused by stimulation of the thyrotrophin receptor by thyroid-stimulating autoantibodies secreted by lymphocytes of the immune system.

Figure 1

Affymetrix microarray studies

The GD candidate genes were identified by microarray analysis. Affymetrix U95A arrays were probed with RNA extracted from CD4 positive lymphocytes from four GD patients and four healthy controls. The four GD microarray datasets were then compared to the four control datasets using the Affymetrix data mining tool to identify differentially expressed genes.
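
The text does not say which statistic the Affymetrix data mining tool applies, so the following Python sketch is only an illustration of how a four-versus-four comparison can flag differentially expressed genes. It combines a log2 fold change with Welch's t statistic; the gene names, signal values and cut-offs are all hypothetical.

```python
import math
from statistics import mean, variance

def log2_fold_change(patients, controls):
    """log2 of the ratio of mean patient signal to mean control signal."""
    return math.log2(mean(patients) / mean(controls))

def welch_t(patients, controls):
    """Welch's t statistic (two samples, variances not assumed equal).

    Raises ZeroDivisionError if both groups have zero variance.
    """
    se = math.sqrt(variance(patients) / len(patients)
                   + variance(controls) / len(controls))
    return (mean(patients) - mean(controls)) / se

def differentially_expressed(data, min_abs_lfc=1.0, min_abs_t=3.0):
    """Return the genes that pass both (assumed) cut-offs, sorted by name."""
    hits = []
    for gene, (patients, controls) in data.items():
        lfc = log2_fold_change(patients, controls)
        t = welch_t(patients, controls)
        if abs(lfc) >= min_abs_lfc and abs(t) >= min_abs_t:
            hits.append(gene)
    return sorted(hits)

# Hypothetical signal intensities: (four GD patients, four healthy controls).
example = {
    "GENE_A": ([820.0, 790.0, 860.0, 805.0], [200.0, 215.0, 190.0, 205.0]),
    "GENE_B": ([300.0, 310.0, 295.0, 305.0], [298.0, 312.0, 301.0, 290.0]),
}
hits = differentially_expressed(example)
```

With only four samples per group, a real analysis would also need multiple-testing correction across all probes on the array; that step is left out here.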

Annotation Pipeline

Over 50 genes were found to be differentially expressed in CD4 positive lymphocytes from GD patients. To understand why these genes were expressed in lymphocytes from GD patients but not in healthy individuals, the GD biologist would like to use myGrid to query public databases such as EMBL, GO, HGVBASE and MEDLINE to view information about gene structure and function, chromosome location, the presence of single nucleotide polymorphisms (SNPs), expression control features and association with other genetic diseases. The experimental conditions and diseases in which the expression of the candidate genes is significantly altered also need to be identified from OMIM.

Genotype Assay Design System

SNPs are small (single base pair) genetic variations found in the genome among individuals. The differential expression of the candidate genes in GD individuals may be due to, or related to, the presence of SNPs associated with GD. The GD biologist is interested in identifying those SNPs which are found in her GD patients and determining their frequency.

Restriction fragment length polymorphism (RFLP) assays are developed to genotype SNPs in her candidate genes. A region flanking either side of the SNP is amplified using polymerase chain reaction (PCR). The amplified PCR product is digested with a suitable restriction enzyme (i.e. one that will cut at one SNP allele and not the other) and the products are run on agarose gels to view product size and determine the genotype.
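
The choice of a "suitable" enzyme can be sketched in Python as follows. This is a minimal illustration, not the myGrid service itself: it counts recognition sites in the amplified region for each allele and keeps the enzymes whose site count differs. The four recognition sites are those of well-known enzymes; the flanking sequences and the helper names are hypothetical.

```python
# Recognition sites of a few well-known restriction enzymes. All four are
# palindromic, so scanning one strand is sufficient.
ENZYME_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "TaqI": "TCGA",
}

def count_sites(sequence, site):
    """Count (possibly overlapping) occurrences of a recognition site."""
    sequence = sequence.upper()
    n = len(site)
    return sum(1 for i in range(len(sequence) - n + 1)
               if sequence[i:i + n] == site)

def diagnostic_enzymes(left_flank, alleles, right_flank, enzymes=ENZYME_SITES):
    """Map each discriminating enzyme to the allele that carries the extra site."""
    allele_a, allele_b = alleles
    seq_a = left_flank + allele_a + right_flank
    seq_b = left_flank + allele_b + right_flank
    result = {}
    for name, site in enzymes.items():
        sites_a = count_sites(seq_a, site)
        sites_b = count_sites(seq_b, site)
        if sites_a != sites_b:
            result[name] = allele_a if sites_a > sites_b else allele_b
    return result

# Hypothetical amplicon: the T allele completes an EcoRI site (GAATTC),
# the C allele does not.
choice = diagnostic_enzymes("ACGTTGAAT", ("T", "C"), "CAGGCTA")
```

An enzyme reported here would give different fragment sizes on the agarose gel for the two alleles, which is what the genotype call relies on.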

The GD biologist would like to use myGrid to:

  1. Query databases to retrieve SNP information associated with candidate genes.
  2. Aid in the design of primers (short pieces of DNA that mark the start and end points of the section of the DNA sequence which she wants to amplify) for the PCR experiment.
  3. Select the restriction enzyme that is specific to a particular SNP for the RFLP experiment.
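
As a rough illustration of the primer design step in the list above, the Python sketch below takes fixed-length primers from the two ends of a region centred on the SNP and estimates their melting temperature with the Wallace rule (2 degrees C per A/T, 4 degrees C per G/C), which is only an approximation for short oligonucleotides. The function names, template sequence and lengths are hypothetical, and real primer design tools apply many more criteria.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def wallace_tm(primer):
    """Approximate melting temperature in degrees C (Wallace rule)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc

def flanking_primers(template, snp_index, flank, primer_len=20):
    """Pick a primer pair that amplifies `flank` bases either side of the SNP.

    `snp_index` is the 0-based position of the SNP in `template`.
    Returns (forward_primer, reverse_primer, amplicon).
    """
    start = snp_index - flank
    end = snp_index + flank + 1  # exclusive end of the amplicon
    if start < 0 or end > len(template) or 2 * primer_len > end - start:
        raise ValueError("region does not fit the template or the primers overlap")
    amplicon = template[start:end].upper()
    forward = amplicon[:primer_len]
    # The reverse primer anneals to the top strand, so it is the reverse
    # complement of the amplicon's 3' end.
    reverse = reverse_complement(amplicon[-primer_len:])
    return forward, reverse, amplicon

# Hypothetical 21-base template with the SNP at index 10.
fwd, rev, amp = flanking_primers("ACGTACGTACGTTGGCCAATT", 10, 10, primer_len=6)
```

In this toy case `fwd` is the first six bases of the region and `rev` is the reverse complement of the last six; `wallace_tm` can then be used to check that the two primers have similar melting temperatures.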

3D Protein Structure & effect of coding SNP on protein active site

Any SNPs occurring in the coding regions of a candidate gene may potentially give rise to a change in the amino acid sequence of the protein encoded by the gene.
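
Whether a given coding SNP actually changes the protein can be checked by translating the affected codon before and after the substitution. The Python sketch below does this with the standard genetic code; the coding sequence and the function names are hypothetical, and positions are assumed to be 0-based within an in-frame coding sequence.

```python
from itertools import product

# Standard genetic code, listed with bases in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(cds):
    """Translate an in-frame coding sequence; a trailing partial codon is ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))

def snp_effect(cds, position, alt_base):
    """Classify a single-base substitution at a 0-based position in the CDS.

    Returns (residue_number, reference_aa, altered_aa, kind), where kind is
    "synonymous", "missense" or "nonsense" and residue_number is 1-based.
    """
    cds = cds.upper()
    offset = position % 3
    codon_start = position - offset
    ref_codon = cds[codon_start:codon_start + 3]
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return codon_start // 3 + 1, ref_aa, alt_aa, kind

# Hypothetical coding sequence ATG GAA GTT (Met-Glu-Val); A>T at index 4 gives GTA (Val).
effect = snp_effect("ATGGAAGTT", 4, "T")
```

Only missense and nonsense changes alter the encoded protein; a synonymous SNP leaves the amino acid sequence unchanged.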

The GD biologist would like to use myGrid to:

  1. Query a protein structure database, e.g. PDB or MSD, to determine whether a structure of the protein encoded by her candidate gene is available. If so, view the protein structure to study how it relates to the function of the protein.
  2. Obtain information about the protein, e.g. its function and functional domains, by querying SWISS-PROT and InterPro.
  3. Use Sheffield's AMBIT web service to retrieve information about an active site whose characteristics may be altered by a coding SNP that has changed the amino acid sequence in the part of the protein that contains the active site.

Papers / Documents

Performing in silico Experiments on the Grid: A Users Perspective. Robert Stevens, Kevin Glover, Chris Greenhalgh, Claire Jennings, Simon Pearce, Peter Li, Melena Radenkovic, Anil Wipat. In Proceedings of the UK e-Science All Hands Meeting 2003, ed. Simon J. Cox, pp. 43-50, ISBN 1-904425-11-9, September 2003.

e-Science and the Grid are not the same; the large-scale movement of data and the exploitation of computation is not the same as the creation, performance and management of an in silico experiment. The notion of the marshalling of resources and creation of virtual organisations begins to bring in a flavour of science, but something more is needed over and above the classic Grid to enable e-Science. This paper looks at the requirements of e-Science from the user's perspective. The myGrid project aims to provide a toolkit of services that comprise the Information Grid and the applications that sit there upon. The aim is to provide a set of services that have the facilities to enable bioinformaticians (in particular) to perform in silico experiments using applications built upon components from a Grid enabled middleware layer. This paper introduces the myGrid project and explores the nature of an in silico experiment for the bioinformatics domain. The paper then reviews the general user requirements for an empirical e-Scientist. We then introduce a biological scenario, where bench experiments are coupled to in silico experiments, which we have used to drive the user requirements capture in myGrid. Then, the myGrid workbench, an application that demonstrates the functionality of myGrid is reviewed. Finally, we match the current status of myGrid to our general requirements and explore how we can use the current implementation to drive the capture of further, more detailed user requirements.

Further details on research using myGrid can be found here

© The University of Southampton on behalf of OMII-UK. All Rights Reserved.