Organism(s): Homo sapiens
Microarray quality metrics report for E-GEOD-18842 on array design A-AFFY-44
- Section 1: Between array comparison
- Distances between arrays
- Principal Component Analysis
Browser compatibility: This report uses recent features of HTML 5. Functionality has been tested on these browsers: Firefox 10, Chrome 17, Safari 5.1.2.
+ Array metadata and outlier detection overview
- Figure 1: Distances between arrays.
Figure 1 (PDF file) shows a false color heatmap of the distances between arrays. The color scale is chosen to cover the range of distances encountered in the dataset. Patterns in this plot can indicate clustering of the arrays either because of intended biological or unintended experimental factors (batch effects). The distance $d_{ab}$ between two arrays a and b is computed as the mean absolute difference (L1-distance) between the data of the arrays (using the data from all probes without filtering). In formula, $d_{ab} = \operatorname{mean}_i |M_{ai} - M_{bi}|$, where $M_{ai}$ is the value of the i-th probe on the a-th array. Outlier detection was performed by looking for arrays for which the sum of the distances to all other arrays, $S_a = \sum_b d_{ab}$, was exceptionally large. One such array was detected, and it is marked by an asterisk, *.
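As an illustration of the distance and the per-array distance sum described above, here is a minimal Python sketch. It is not the code that produced this report; the function names and the toy intensity values are hypothetical. The report does not state the cutoff used to call a sum "exceptionally large", so the sketch only reports which array has the largest sum.

```python
from statistics import mean


def l1_distance(a, b):
    """Mean absolute difference between two equally long probe vectors."""
    return mean(abs(x - y) for x, y in zip(a, b))


def distance_matrix(arrays):
    """Matrix of pairwise distances d_ab between all arrays."""
    n = len(arrays)
    return [[l1_distance(arrays[a], arrays[b]) for b in range(n)]
            for a in range(n)]


def distance_sums(dist):
    """S_a: sum of the distances from array a to all arrays (d_aa is 0)."""
    return [sum(row) for row in dist]


# Hypothetical toy data: 4 arrays, 5 probes each (not from E-GEOD-18842).
arrays = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.5, 2.5, 3.5, 4.5, 5.5],
    [1.0, 2.0, 3.0, 4.0, 6.0],
    [9.0, 9.0, 9.0, 9.0, 9.0],
]
dist = distance_matrix(arrays)
sums = distance_sums(dist)

# The array with the largest S_a is the first candidate for an outlier.
candidate = sums.index(max(sums))
```

In this toy example the fourth array differs strongly from the other three, so it has the largest distance sum.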
- Figure 3: Principal Component Analysis.
Figure 3 (PDF file) shows a scatterplot of the arrays along the first two principal components. You can use this plot to explore if the arrays cluster, and whether this is according to an intended experimental factor, or according to unintended causes such as batch effects. Move the mouse over the points to see the sample names.
Principal component analysis is a dimension reduction and visualisation technique that is here used to project the multivariate data vector of each array into a two-dimensional plot, such that the spatial arrangement of the points in the plot reflects the overall data (dis)similarity between the arrays.
Note: the figure is static - enhancement with interactive effects failed. This is either due to a version incompatibility of the 'SVGAnnotation' R package and your version of 'Cairo' or 'libcairo', or due to plot misformatting. Please consult the Bioconductor mailing list, or contact the maintainer of 'arrayQualityMetrics' with a reproducible example in order to fix this problem.
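The projection onto the first two principal components described for Figure 3 can be sketched as follows. This is a dependency-free illustration, not the routine used to draw the figure: it finds the leading eigenvectors of the centered Gram matrix by power iteration with deflation, whereas a real analysis would normally call an SVD routine. All function names and the toy data are hypothetical.

```python
import math


def center_columns(rows):
    """Subtract each probe's mean across arrays (rows are arrays)."""
    n = len(rows)
    col_means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, col_means)] for row in rows]


def gram(rows):
    """Array-by-array matrix of inner products."""
    return [[sum(x * y for x, y in zip(r, s)) for s in rows] for r in rows]


def top_eigenpair(matrix, iterations=500):
    """Leading eigenpair by power iteration (fixed iteration count)."""
    n = len(matrix)
    # Non-constant start vector: a constant vector lies in the null
    # space of a centered Gram matrix and would go nowhere.
    v = [float(i + 1) for i in range(n)]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:
            return 0.0, [0.0] * n
        eigenvalue = norm / math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in w]
    return eigenvalue, v


def pca_scores(rows, n_components=2):
    """Coordinates of each array along the leading principal components."""
    g = gram(center_columns(rows))
    n = len(g)
    components = []
    for _ in range(n_components):
        lam, v = top_eigenpair(g)
        components.append([math.sqrt(lam) * x for x in v])
        # Deflation: remove the component just found.
        g = [[g[i][j] - lam * v[i] * v[j] for j in range(n)]
             for i in range(n)]
    return [list(point) for point in zip(*components)]


# Hypothetical toy data: two groups of two arrays, 3 probes each.
rows = [
    [1.0, 2.0, 3.0],
    [1.2, 2.1, 2.9],
    [5.0, 6.0, 7.0],
    [5.1, 5.8, 7.2],
]
scores = pca_scores(rows)  # one (PC1, PC2) pair per array
```

Each entry of `scores` is the position of one array in the two-dimensional plot. In the toy data the two groups fall on opposite sides of the first component. The sign of each component is arbitrary, as in any PCA.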
- Figure 4: Boxplots.
Figure 4 (PDF file) shows boxplots representing summaries of the signal intensity distributions of the arrays. Each box corresponds to one array. Typically, one expects the boxes to have similar positions and widths. If the distribution of an array is very different from the others, this may indicate an experimental problem. Outlier detection was performed by computing the Kolmogorov-Smirnov statistic $K_a$ between each array's distribution and the distribution of the pooled data.
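The Kolmogorov-Smirnov statistic mentioned above is the largest vertical gap between two empirical distribution functions. A minimal Python sketch, assuming the pooled data is simply the concatenation of all arrays, could look like this. The function names and the toy values are hypothetical, and the cutoff that turns a statistic into an outlier call is not given in this report, so it is left out.

```python
from bisect import bisect_right


def ecdf(sorted_values, x):
    """Fraction of values less than or equal to x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def ks_statistic(sample, pooled):
    """Largest absolute gap between the two empirical distributions.

    Both are step functions, so checking the pooled data points is
    enough as long as the pooled data contains the sample's values.
    """
    s = sorted(sample)
    p = sorted(pooled)
    return max(abs(ecdf(s, x) - ecdf(p, x)) for x in p)


# Hypothetical toy data: 3 arrays, 4 probes each.
arrays = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, 3.0, 4.0],
    [11.0, 12.0, 13.0, 14.0],
]
pooled = [x for array in arrays for x in array]
ks = [ks_statistic(array, pooled) for array in arrays]
```

The third array is shifted away from the other two, so its statistic is the largest of the three.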
- Figure 6: Density plots.