# The Basic Mechanics of Principal Components Analysis

This page explains how principal components analysis can be computed. The algorithm described below is not the one used in standard programs; it is presented because the commonly used algorithms can only be explained using concepts from linear algebra.^{[note 1]}

## Computing the first component

As discussed on the main Principal Components Analysis page, PCA analyzes a Correlation Matrix and infers components that are consistent with the observed correlations.

Each component is created as a weighted sum of the existing variables. PCA starts by trying to find the single component that best explains the observed correlations^{[note 2]} between the variables.

Consider the following three variables:

| v1 | v2 | v3 |
|---|---|---|
| 1 | 1 | 1 |
| 2 | 3 | 5 |
| 3 | 2 | 2 |
| 4 | 5 | 3 |
| 5 | 4 | 4 |

The correlation matrix of the three variables is:

| | v1 | v2 | v3 |
|---|---|---|---|
| v1 | 1.0 | .8 | .4 |
| v2 | .8 | 1.0 | .6 |
| v3 | .4 | .6 | 1.0 |
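The correlation matrix can be checked directly from the five observations. The following is a minimal sketch using NumPy; the variable names `data` and `corr` are chosen here for illustration and are not part of the original description.

```python
import numpy as np

# The five observations of v1, v2 and v3 (one row per case).
data = np.array([
    [1, 1, 1],
    [2, 3, 5],
    [3, 2, 2],
    [4, 5, 3],
    [5, 4, 4],
], dtype=float)

# rowvar=False tells NumPy that each column (not each row) is a variable.
corr = np.corrcoef(data, rowvar=False)

# Rounded to one decimal place for comparison with the matrix in the text.
print(np.round(corr, 1))
```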

Note that there are moderate-to-strong correlations between all of the variables. Thus, any underlying component must be correlated with all the variables. A first guess then is that our new component could simply be the sum of each of the existing variables:

`Component = v1 + v2 + v3`

The resulting component matrix, which shows the correlation between each of the variables and the computed component, is then:

| | Component |
|---|---|
| v1 | .856 |
| v2 | .934 |
| v3 | .778 |
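These correlations can be reproduced by summing the three variables for each case and correlating each variable with that sum. The sketch below assumes the equal-weight component is the plain row sum of the raw variables; `component` and `loadings` are illustrative names.

```python
import numpy as np

# The five observations of v1, v2 and v3 (one row per case).
data = np.array([
    [1, 1, 1],
    [2, 3, 5],
    [3, 2, 2],
    [4, 5, 3],
    [5, 4, 4],
], dtype=float)

# First guess: give every variable a weight of 1, i.e. sum across each row.
component = data.sum(axis=1)

# Correlate each variable with the summed component.
loadings = np.array([
    np.corrcoef(data[:, j], component)[0, 1]
    for j in range(data.shape[1])
])

# Rounded to three decimals for comparison with the component matrix.
print(np.round(loadings, 3))
```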

These correlations are all very high, and thus our estimated component is a pretty good component. However, it can be improved. Looking again at the correlation matrix, reproduced below, we can deduce that our original guess of giving equal weights to the different variables was a touch naïve. Note that **v2** has the highest average correlation with all the variables. Thus, if we were instead to give a higher weight to **v2** when estimating our component, we would likely end up with marginally higher correlations with all the variables. Similarly, note that **v3** has the lowest average correlation and thus, by the same argument, it should be given a lower weight.

| | v1 | v2 | v3 |
|---|---|---|---|
| v1 | 1.0 | .8 | .4 |
| v2 | .8 | 1.0 | .6 |
| v3 | .4 | .6 | 1.0 |

Using trial and error, we can deduce that the optimal formula^{[note 3]} for computing the component is:
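As an alternative to trial and error, weights of this kind can be estimated numerically. The sketch below is an assumption about how one might do this, not the procedure used in the text: it takes the first eigenvector of the correlation matrix (the variance-maximizing solution mentioned in note 3) and rescales it so that **v1** has a weight of 1. The weights apply to standardized variables; in this example all three variables happen to have the same standard deviation, so the relativities carry over to the raw variables.

```python
import numpy as np

# Correlation matrix of v1, v2 and v3 from the text.
corr = np.array([
    [1.0, 0.8, 0.4],
    [0.8, 1.0, 0.6],
    [0.4, 0.6, 1.0],
])

# eigh is intended for symmetric matrices and returns eigenvalues in ascending order,
# so the last column of `eigenvectors` belongs to the largest eigenvalue.
eigenvalues, eigenvectors = np.linalg.eigh(corr)
first = eigenvectors[:, -1]

# Express the weights relative to v1. Dividing by first[0] also removes
# the arbitrary sign of the eigenvector.
weights = first / first[0]
print(np.round(weights, 3))
```

Multiplying `weights` by any non-zero constant gives a differently scaled component with the same relativities between the variables.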

Note that we have not multiplied **v1** by anything other than 1. This is because the numbers that are multiplied by the other variables are relative to **v1** having a weight of 1. If we were to put a weight other than 1 next to **v1**, we would then have to multiply each of the other weights by this number. For example, the following weights are the ones generated by SPSS (and shown in the Component Score Coefficient Matrix), and you can see that their relativities are the same:

## Computing the remaining components

The next component is computed as follows:

- Regression is used to predict each variable from the component.
- The residuals of the regression model are then computed.
- The correlation matrix is computed using the residuals.
- The same basic process as described above is performed to create a second component.
- These steps are then repeated until the number of components is equal to the number of variables.^{[note 4]}
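The steps above can be sketched in code. This is a minimal illustration under stated assumptions rather than the procedure any particular program uses: the variables are standardized first, each component is taken as the first eigenvector of the current matrix (instead of trial and error), and the covariance matrix of the residuals is used at each step without re-standardizing them. The names `directions` and `resid` are illustrative.

```python
import numpy as np

# The five observations of v1, v2 and v3 (one row per case).
data = np.array([
    [1, 1, 1],
    [2, 3, 5],
    [3, 2, 2],
    [4, 5, 3],
    [5, 4, 4],
], dtype=float)

# Standardize so that the covariance matrix of z equals the correlation matrix.
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

directions = []   # one weight vector per component
resid = z.copy()  # residuals; initially the standardized variables themselves

for _ in range(z.shape[1]):
    # Assumption: use the covariance matrix of the current residuals.
    cov = np.cov(resid, rowvar=False)

    # The eigenvector of the largest eigenvalue gives the component's weights
    # (eigh returns eigenvalues in ascending order).
    _, vecs = np.linalg.eigh(cov)
    w = vecs[:, -1]
    scores = resid @ w

    # Regress each variable on the component scores. The columns are centered,
    # so the slope is sum(x * s) / sum(s * s) and no intercept is needed.
    slopes = resid.T @ scores / (scores @ scores)

    # Keep what the component does not explain and repeat on the residuals.
    resid = resid - np.outer(scores, slopes)
    directions.append(w)

print(np.round(np.array(directions), 3))
```

Under these assumptions, each weight vector found this way matches (up to sign) the corresponding eigenvector of the original correlation matrix, which is why standard programs can compute all the components in one decomposition.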

## Rotation

Typically, Varimax Rotation is performed to aid interpretation.
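As a rough illustration of what a Varimax rotation does, the sketch below applies a commonly used SVD-based iteration to the loadings of the first two components of the example correlation matrix. Retaining two components, the function name `varimax`, and its `max_iter` and `tol` parameters are all assumptions made for this example; statistical packages may differ in details such as Kaiser normalization.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Return (rotated loadings, orthogonal rotation matrix)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like target of the varimax criterion.
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        # The SVD gives the closest orthogonal matrix to the update direction.
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        # Stop once the criterion no longer improves by more than tol (relative).
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

# Correlation matrix of v1, v2 and v3 from the text.
corr = np.array([
    [1.0, 0.8, 0.4],
    [0.8, 1.0, 0.6],
    [0.4, 0.6, 1.0],
])

# Unrotated loadings of the two largest components:
# eigenvectors scaled by the square roots of their eigenvalues.
vals, vecs = np.linalg.eigh(corr)
loadings = vecs[:, [-1, -2]] * np.sqrt(vals[[-1, -2]])

rotated, rotation = varimax(loadings)
print(np.round(rotated, 3))
```

Because the rotation is orthogonal, it changes how the explained variance is shared between the components but not how much of each variable is explained in total.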

## Notes

- ↑ The components are computed using a *Singular Value Decomposition*.
- ↑ It is also possible to best explain the observed covariances instead of correlations.
- ↑ Technically, principal components analysis maximizes the variance explained rather than the average correlation, but the difference is not important for the purpose of this explanation.
- ↑ Unless some of the variables are *linearly dependent*, in which case the number of components computed is equivalent to the *rank* of the variables.