Microarray Analysis of Gene Expression in Huntington's Disease Peripheral Blood - a Platform Comparison
The following tutorial walks through the identification of biological themes in two microarray datasets examining Huntington’s disease peripheral blood.
Before you begin, we recommend that you view the webinar. The top button on the right will take you to it if you haven't already.
When you are ready to begin this hands-on tutorial, click the third button down on the right, labeled "Log in to the dataset with tutorial."
1. Select Pairwise in the menu bar on the left side of the screen labeled Control Panel. You will see “Pairwise” under the heading marked “Analysis.”
2. Select the magnifying glass icon next to “Human UniSet 20K” in the list. This will examine the CodeLink data.
3. At the top of the page is a list of the different experiment groups contained in this analysis. We’ll be comparing expression between the healthy and HD samples. Select the 14 “Healthy” arrays to place them in Group 1.
4. Select the 12 “HD” samples for Group 2.
5. Pairwise analysis combines a fold-change cutoff with a quality filter and comparison statistics to generate a list of differentially expressed genes. Select the following settings:
Normalization: None
The data has been normalized so that each sample has a median signal intensity of 1, so no further normalization is required.
Statistics: t-test
Performs a two-sample, unpaired t-test for each gene that passes the quality and fold-change cutoffs.
Quality: .75
Filters out genes that received absent or marginal detection calls in both groups.
Threshold: Lower = 1.5; Upper = None.
Filters out genes whose expression changes by less than 1.5-fold between the two groups.
Correction: Benjamini and Hochberg
Calculates a false discovery rate from the raw p-values using the method of Benjamini and Hochberg.
Data transformation: Log Transform Data
This setting applies a log base 2 transformation to the signal values.
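The log transformation, fold-change threshold, and unpaired t-test described in these settings can be sketched in a few lines of Python. This is only an illustration, not the tool's actual implementation: the function names are hypothetical, the signal values are made up, and two details are assumptions because the tutorial does not state them, namely that fold change is taken from the difference of mean log2 signals and that the t statistic uses the unequal-variance (Welch) form. Turning the statistic into a p-value needs the t distribution, which the standard library does not provide.

```python
import math
from statistics import mean, variance


def log2_values(signals):
    """Log base 2 transform a list of positive signal intensities."""
    return [math.log2(s) for s in signals]


def fold_change(group1, group2):
    """Fold change of group2 relative to group1, from mean log2 signals (assumed definition)."""
    diff = mean(log2_values(group2)) - mean(log2_values(group1))
    return 2 ** diff


def passes_threshold(fc, lower=1.5):
    """True when expression changes by at least `lower`-fold in either direction."""
    return fc >= lower or fc <= 1 / lower


def welch_t(group1, group2):
    """Unpaired two-sample t statistic on log2 values (Welch form, an assumption)."""
    a, b = log2_values(group1), log2_values(group2)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se


# Made-up signals for a single gene; these are not values from the tutorial dataset.
healthy = [1.0, 1.0, 1.0, 1.0]
hd = [2.0, 2.0, 4.0, 4.0]

fc = fold_change(healthy, hd)
if passes_threshold(fc):
    print(f"fold change {fc:.2f}, t = {welch_t(healthy, hd):.2f}")
```

Only genes that pass the fold-change check reach the t-test, mirroring the order described for the Statistics setting.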
6. Select the Analyze button.
7. At the top is a summary of the analysis just performed. The gene list below shows the genes that passed all our analysis parameters. By default, the most differentially expressed genes are shown first.
8. To filter the list using the adjusted p-value (false discovery rate), select “adjusted p” from the pull-down menu and then click the Search button.
9. The list filtered on the adjusted p-value contains 438 genes, all of which have a false discovery rate below 5% (adjusted p-value less than 0.05).
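For readers who want to see what the adjusted p-value represents, the sketch below applies the Benjamini and Hochberg step-up adjustment to a handful of raw p-values and keeps the genes below 0.05. The gene names and p-values are invented for illustration, and the function name is hypothetical; the tool's own implementation is not shown in this tutorial.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values never increase
    # as the raw p-value gets smaller.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented raw p-values for four hypothetical genes.
raw = {"geneA": 0.01, "geneB": 0.04, "geneC": 0.03, "geneD": 0.20}
names = list(raw)
adj = benjamini_hochberg([raw[n] for n in names])

# Keep genes whose adjusted p-value (false discovery rate) is below 0.05.
significant = [n for n, q in zip(names, adj) if q < 0.05]
print(significant)
```

Note that a gene with a raw p-value below 0.05 can still fail the adjusted cutoff, which is why the filtered list is usually shorter than the unfiltered one.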
10. To view data and a gene summary for any gene in the list, click the Gene Name. This will bring up a data summary and a One-Click Gene Summary™ (OCGS) for the gene. The One-Click Gene Summary provides a synopsis of current UniGene and Entrez Gene (formerly known as LocusLink) information for the gene.
11. Click Pairwise again to return to the array selection screen.
12. Select the magnifying glass icon next to “HG-U133A” in the list. This will examine the Affymetrix data.
13. Again, select the 14 “Healthy” arrays to place them in Group 1.
14. Select the 12 “HD” samples for Group 2.
15. Select the following settings:
Normalization: None
The data has been normalized and scaled using MAS5, so no further normalization is required.
Statistics: t-test
Performs a two-sample, unpaired t-test for each gene that passes the quality and fold-change cutoffs.
Quality: 100
Filters out genes that received absent or marginal detection calls in both groups.
Threshold: Lower = 1.5; Upper = None.
Filters out genes whose expression changes by less than 1.5-fold between the two groups.
Correction: Benjamini and Hochberg
Calculates a false discovery rate from the raw p-values using the method of Benjamini and Hochberg.
Data transformation: Log Transform Data
This setting applies a log base 2 transformation to the signal values.
16. Select the Analyze button.
17. To filter the list using the adjusted p-value (false discovery rate), select “adjusted p” from the pull-down menu and then click the Search button.
18. Again, to view data and a gene summary for any gene in the list, click the Gene Name, which will bring up the OCGS.
Only a few specific aspects of the data set have been explored here. Feel free to examine the data further on your own.