This tutorial shows how to identify biological themes associated with expression patterns by applying unsupervised clustering to a microarray dataset that examines mouse heart development.
Before you begin, we recommend that you review the analysis summary. The top button of the four on the right will download it if you haven't already.
When you are ready to begin this hands-on tutorial, click the third button down on the right, labeled "Log in to the dataset with tutorial."
1. Select Projects in the menu bar on the left side of the screen labeled Control Panel. You will see “Projects” under the heading marked “Analysis.”
2. Click on the magnifying glass icon next to the FVB Development time course :: Filter project. This is a sub-project created from filtering the data within the FVB Development project. The initial analysis filtered the ~10,000 genes represented on the Affymetrix U74A GeneChip® based on the following parameters:
- applying ANOVA to the entire set
- applying a 5% false discovery rate cutoff
- filtering for a minimum 2-fold change cutoff
This analysis produced a list of 1516 genes, which was saved as the sub-project and will be analyzed in this tutorial.
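For readers who want to see what this kind of filter does outside the web interface, here is a minimal sketch. It assumes the ANOVA p-values have already been computed for each gene, uses the Benjamini-Hochberg procedure as the false discovery rate method, and measures fold change as the ratio of the highest to the lowest condition mean. All three are assumptions for illustration; the tutorial does not state which FDR method or fold-change definition the site uses, and the gene names and values below are made up.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the p-value passes BH at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Every gene ranked at or below the cutoff rank is kept.
    passed = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            passed[idx] = True
    return passed


def max_fold_change(condition_means):
    """Ratio of highest to lowest condition mean (assumes positive, linear-scale values)."""
    return max(condition_means) / min(condition_means)


def filter_genes(genes, fdr=0.05, min_fold=2.0):
    """genes: list of (name, anova_p_value, condition_means) tuples."""
    passed = benjamini_hochberg([p for _, p, _ in genes], fdr)
    return [
        name
        for (name, _, means), ok in zip(genes, passed)
        if ok and max_fold_change(means) >= min_fold
    ]


# Hypothetical genes with one mean per time point (6 time points).
example_genes = [
    ("geneA", 0.001, [100, 150, 220, 300, 310, 320]),
    ("geneB", 0.004, [100, 105, 110, 120, 125, 130]),
    ("geneC", 0.30, [50, 200, 60, 55, 52, 51]),
    ("geneD", 0.02, [400, 300, 250, 180, 150, 120]),
]

print(filter_genes(example_genes))  # ['geneA', 'geneD']
```

In this toy input, geneB is significant but changes less than 2-fold, and geneC changes 4-fold but is not significant, so only geneA and geneD survive both cutoffs.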
3. At the top of this page is a Project Summary of the project data, describing the Array and the Conditions. For this tutorial, there is one Group containing 6 time points.
4. Select Cluster from the list of analysis options.
5. To perform unsupervised clustering on the filtered data, go to the PAM (Partitioning Around Medoids) section, set "Clusters" to 2 from the pull-down menu, and keep the defaults for the other settings. Click the "Search" button. A graph showing the center of each cluster will be displayed.
6. The resulting clusters have a mean silhouette width of 0.598.
7. To view page-specific help documents for this or any page, select the question mark icon located at the upper right page corner. On this page, help documents are available for PAM and silhouettes (including the interpretation of the mean silhouette width).
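As a rough illustration of what the mean silhouette width measures, the sketch below computes it for a set of points that have already been assigned to clusters. For each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the silhouette width is (b - a) / max(a, b). Values near 1 indicate well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest points that sit closer to another cluster. The Euclidean distance and the toy points are assumptions for illustration; the site may use a different distance measure on the expression profiles.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean_silhouette_width(points, labels, dist=euclidean):
    """Average silhouette width over all points for a given cluster labeling."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    widths = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            widths.append(0.0)  # usual convention for a single-member cluster
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        widths.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(widths) / len(widths)


# Toy data: two tight pairs of points that are far apart from each other.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = mean_silhouette_width(toy_points, [0, 0, 1, 1])  # natural grouping
bad = mean_silhouette_width(toy_points, [0, 1, 0, 1])   # deliberately mixed
print(round(good, 3), round(bad, 3))
```

The natural grouping scores close to 1, while the deliberately mixed labeling scores below 0, which is the pattern to keep in mind when reading the value reported for the two PAM clusters.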
8. To view the list of genes associated with a cluster, click on the graph. For this tutorial, select the graph for the cluster with 982 genes.
9. The resulting page shows the expression pattern of the 982 genes. The list of genes and their associated heat map over the 6 time points are displayed.
10. To view information for a particular gene, select the gene's title. Select one of the genes.
11. Gene summaries are available for all genes in the list. The upper half of the summary includes an overview of the data for a particular gene. This includes a summary of the averaged data for each condition, as well as the data for each of the replicates included in each condition.
12. The lower half of the summary is the One-Click Gene Summary™ (OCGS). For the gene selected from the gene list, a synopsis displays the most current information from several databases, including UniGene and Entrez Gene (formerly called LocusLink). Links to additional databases and information are identified by blue text.
13. Summary information also includes the Gene Ontology terms associated with this particular gene product.
14. From the OCGS ("Analysis") window, click on the browser's "Back" button to return to the gene list for the cluster.
15. Select from options at the top of the gene list to perform additional analysis. Click on the Reports: Ontology link to view a summary of all the gene families in the cluster.
16. A new browser window will open containing the Ontology report.
17. Select the Z-score report in the upper right corner to list all significant ontology terms.
18. The z-score report lists all ontologies from the gene list with a z-score greater than 2 or less than -2.
19. Select the z-score link, above the far right column, to sort the list by z-score.
20. Select a gene list icon, in the column titled Genes, to expand the display and view the genes from the gene list that have that ontology term. As before, you can then select a gene to display its One-Click Gene Summary window.
21. Gene Ontologies are subdivided based on functional distinctions. In the previous steps, ontologies associated with Biological Process have been examined. To examine genes associated with other functional categories, select either Cellular Component or Molecular Function from the top of the page.
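To give a feel for the z-scores in the ontology report, the sketch below shows one common way such a score is computed: the number of genes in the list that carry a term is compared with the number expected by chance, and the difference is divided by the standard deviation of the hypergeometric distribution. This formula is an assumption for illustration; the tutorial does not state how the site calculates its z-scores or which background set it uses. The term counts in the example are hypothetical, while the list size of 982 and the array size of roughly 10,000 come from the steps above.

```python
import math


def ontology_z_score(in_list_with_term, list_size, on_array_with_term, array_size):
    """z = (observed - expected) / sd, using the hypergeometric mean and variance."""
    if not 0 < list_size < array_size or not 0 < on_array_with_term < array_size:
        raise ValueError("counts must leave a non-zero variance")
    p = on_array_with_term / array_size  # fraction of all genes carrying the term
    expected = list_size * p             # term genes expected in the list by chance
    variance = list_size * p * (1 - p) * (array_size - list_size) / (array_size - 1)
    return (in_list_with_term - expected) / math.sqrt(variance)


# Hypothetical term: 200 of ~10,000 genes on the array carry it,
# and 40 of those fall in the 982-gene cluster.
z = ontology_z_score(40, 982, 200, 10000)
print(round(z, 2))
```

With these made-up counts, about 20 term genes would be expected in the cluster by chance, so observing 40 gives a z-score well above the cutoff of 2. A term observed less often than expected would produce a negative score.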