Tutorial

Molecular changes in androgen-independent prostate cancer

The following tutorial walks through the identification of biological themes in a microarray dataset examining androgen-independent prostate cancer.

Before you begin, we recommend that you review the analysis summary. The top button of the four on the right will download it if you haven't already.

When you are ready to begin this hands-on tutorial, click the third button down on the right, labeled "Log in to the dataset with tutorial."

1. In the menu bar labeled Control Panel on the left side of the screen, select “Pairwise,” which appears under the heading marked “Analysis.”

2. Select the magnifying glass icon next to “HG-U133A” in the list. The data presented here was generated using the Affymetrix® GeneChip® Human Genome U133A array. Approximately 22,000 transcripts are represented on this array.

3. At the top of the page is a list of the different experiment groups contained in this analysis. We’ll be comparing expression between the androgen-dependent and -independent samples. Select the ten “Androgen-dependent” arrays to place them in Group 1.

4. Select the ten “Androgen-independent” arrays for Group 2.

5. Pairwise analysis combines a fold-change cutoff with a quality filter and comparison statistics to generate a list of differentially expressed genes. Select the following settings:

Normalization: None
Data was already normalized with GC-RMA during upload.

Statistics: Wilcoxon
Performs a Wilcoxon rank sum test for each gene that passes the fold-change cutoff.

Quality: N/A
Quality calls are not generated by GC-RMA.
Check Exclude Controls.

Threshold: Lower = 1.5; Upper = None.
Filters out genes with less than a 1.5-fold change in expression.

Correction: Benjamini and Hochberg
Calculates a false discovery rate from the raw p-values using the method of Benjamini and Hochberg.

Data transformation: Data already log transformed
Data was logged during GC-RMA normalization.
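The settings above can be illustrated with a minimal Python sketch for a single gene. It is not the tool's own implementation. It assumes the values are on a log2 scale, that fold change is reported for Group 2 relative to Group 1, and that the rank sum p-value uses a normal approximation without tie correction. The function names and the expression values are hypothetical.

```python
import math
from statistics import mean

def fold_change(group1, group2):
    """Signed fold change of group2 vs. group1 from log2-scale values."""
    diff = mean(group2) - mean(group1)  # difference of log2 means = log2 ratio
    ratio = 2 ** abs(diff)
    return ratio if diff >= 0 else -ratio

def rank_sum_p(group1, group2):
    """Two-sided Wilcoxon rank sum p-value (normal approximation, no tie correction)."""
    pooled = sorted((v, g) for g, vals in enumerate((group1, group2)) for v in vals)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # tied values share the average rank
        i = j + 1
    n1, n2 = len(group1), len(group2)
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)  # rank sum of group 1
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))

# Made-up log2 expression values for one gene across ten arrays per group.
dependent = [7.0, 7.2, 6.9, 7.1, 7.0, 6.8, 7.3, 7.1, 6.9, 7.0]
independent = [v + 1.0 for v in dependent]

fc = fold_change(dependent, independent)
passes_threshold = abs(fc) >= 1.5  # Lower = 1.5, Upper = None
# The test is only run for genes that pass the fold-change cutoff.
p_value = rank_sum_p(dependent, independent) if passes_threshold else None
```

Because the data is already log transformed, the fold change comes from a difference of means rather than a ratio of means.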

6. Select the Analyze button.

7. After the analysis is performed, a gene list is returned. This list contains the genes that are differentially expressed under the pairwise analysis settings selected. In this example, 785 genes passed the initial filtering criteria. The genes are sorted by fold change, and the first 50 genes in the list are displayed.

8. To filter the list using the adjusted p-value (false discovery rate), select “adjusted p” from the pull-down menu and then click the Search button.

9. The list filtered on the adjusted p-value contains 468 genes with a false discovery rate of less than 5%.
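For readers who want to see how an adjusted p-value is derived, the following is a minimal sketch of the standard Benjamini and Hochberg step-up adjustment in Python. The function name and the raw p-values are hypothetical, and the tool's own implementation may differ in detail.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value
    # never exceeds the one ranked above it.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Made-up raw p-values for four genes.
raw_p = [0.001, 0.02, 0.03, 0.8]
adjusted_p = benjamini_hochberg(raw_p)

# Keep genes with a false discovery rate below 5%.
significant = [i for i, p in enumerate(adjusted_p) if p < 0.05]
```

Filtering on the adjusted value rather than the raw p-value is what reduces the list in this step from 785 genes to 468.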

10. To view data and a gene summary for any gene in the list, click the Gene Name. This will bring up a data summary and a One-Click Gene Summary™ (OCGS) for the gene. The One-Click Gene Summary provides a synopsis of current UniGene and Entrez Gene (formerly known as LocusLink) information for the gene.

11. Go back to the gene list by clicking the “Back” button in your browser.

12. Select the Ontology link at the top of the screen to view a summary of the Gene Ontology terms associated with the genes in the list. See the online help system for information about the other reports.

Note: To view the page-specific online help documents for any page, select the question mark icon located in the upper right corner of that page.

13. The Ontology Report lists the Gene Ontology terms associated with the 468 genes in the pairwise results gene list. See the help documents for this page for more information about the Ontology Report.

14. Click on Z-score report in the upper right corner of the Ontology Report window.

15. The z-score report lists the biological process ontologies that are significantly over- or under-represented in the gene list (z-score greater than 2 or less than -2, respectively). Select the red arrow in the z-score column (on the right of the screen) to sort the list by z-score for the up-regulated genes.

Z-score reports can be generated for the Molecular Function and Cellular Component ontologies as well.
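This tutorial does not state how the z-score is calculated. One commonly used formulation for ontology over-representation compares the observed number of list genes annotated to a term with the number expected by chance, scaled by a hypergeometric standard deviation. The Python sketch below assumes that formulation. The function name and all counts other than the 468-gene list size are hypothetical.

```python
import math

def ontology_z_score(r, n, R, N):
    """
    z-score for one ontology term under a hypergeometric model.

    r: genes in the list annotated to the term
    n: genes on the array annotated to the term
    R: genes in the list
    N: genes on the array
    """
    expected = n * R / N
    variance = n * (R / N) * (1 - R / N) * (1 - (n - 1) / (N - 1))
    return (r - expected) / math.sqrt(variance)

# Hypothetical term: 100 annotated genes on a 22,000-transcript array,
# 10 of which appear in the 468-gene list.
z_over = ontology_z_score(r=10, n=100, R=468, N=22000)

# The same term with no genes in the list gives a negative score.
z_under = ontology_z_score(r=0, n=100, R=468, N=22000)
```

A term is reported as over-represented when the score exceeds 2 and as under-represented when it falls below -2, matching the cutoffs described in step 15.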

16. Minimize, move, or close the z-score report window. Return to the main analysis window, and click KEGG. This will bring up a new z-score report for the KEGG pathway terms associated with the differentially expressed genes.

17. In the KEGG column, click on the KEGG icon in the Ribosome row to show its KEGG pathway diagram. Differentially regulated genes are highlighted in red.

18. Return to the main analysis window again and click Scatter Plot.

19. This will bring up a scatter plot of the results. Up-regulated genes are shown in red, and down-regulated genes are shown in green. The gray spots are genes that did not pass the analysis parameters. Move the blue box around and click Zoom to see more detail of the scatter plot.

20. Click on data points in the detail view in the upper right section of your screen to bring up the gene summary for a specific gene.

Only a few specific aspects of the data set have been explored here. Feel free to examine the data further on your own.