To do parallel analysis for pca or factor analysis, you will need to download a program written by
ATS called **fapara**. You can get the program by typing the command

`search fapara`

and then following the installation instructions.

Parallel analysis is a method for determining the number of components or factors to retain from pca or factor analysis. Essentially, the program works by creating a random dataset with the same numbers of observations and variables as the original data. A correlation matrix is computed from the randomly generated dataset, and then the eigenvalues of that correlation matrix are computed. When the eigenvalues from the random data are larger than the eigenvalues from the pca or factor analysis, you know that the corresponding components or factors are mostly random noise.
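The random-data step described above can be sketched outside Stata. The following Python/NumPy code is an illustration only and is not the **fapara** source: the function name `random_eigenvalues`, the use of standard normal draws, and the seed are all assumptions made for this example.

```python
import numpy as np

def random_eigenvalues(n_obs, n_vars, rng):
    """Eigenvalues of the correlation matrix of one random dataset (hypothetical helper)."""
    # Random data with the same shape as the real data (normal draws are an assumption).
    data = rng.standard_normal((n_obs, n_vars))
    # Correlation matrix of the columns (variables).
    corr = np.corrcoef(data, rowvar=False)
    # eigvalsh is meant for symmetric matrices and returns ascending order; reverse it.
    return np.linalg.eigvalsh(corr)[::-1]

rng = np.random.default_rng(12345)  # arbitrary seed
eigs = random_eigenvalues(n_obs=568, n_vars=6, rng=rng)
print(eigs)
```

Because the variables are unrelated by construction, the eigenvalues stay close to 1, and they always sum to the number of variables (the trace of the correlation matrix).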

We will demonstrate the use of the command **fapara** using a dataset from the Stata manual
called **bg2**. We will begin with a pca and follow that with a factor analysis.

After running the **pca** command we will run the **fapara** command with
the **pca** and **reps(10)** options. The **pca** option ensures that the
program obtains the eigenvalues from the correlation matrix without communality estimates
in the diagonal as you would find in factor analysis.
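To make the distinction concrete, here is a rough Python/NumPy sketch of the two kinds of matrix. It is not taken from **fapara**. It assumes the communality estimates are squared multiple correlations (SMCs), a common choice for principal factors, and the helper name `reduced_correlation` and the random test data are hypothetical.

```python
import numpy as np

def reduced_correlation(corr):
    """Copy of corr with SMCs on the diagonal (assumed communality estimate)."""
    # SMC of variable i on all the others: 1 - 1 / (i-th diagonal element of inverse corr).
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(corr))
    reduced = corr.copy()
    np.fill_diagonal(reduced, smc)
    return reduced

rng = np.random.default_rng(2024)  # arbitrary seed
data = rng.standard_normal((568, 6))
corr = np.corrcoef(data, rowvar=False)

# PCA-style eigenvalues: ones on the diagonal, so they sum to the number of variables.
pca_eigs = np.linalg.eigvalsh(corr)[::-1]
# Factor-analysis-style eigenvalues: communality estimates on the diagonal.
fa_eigs = np.linalg.eigvalsh(reduced_correlation(corr))[::-1]
print(pca_eigs)
print(fa_eigs)
```

With communality estimates on the diagonal, the eigenvalues sum to the total of those estimates rather than to the number of variables, which is why factor analysis eigenvalues are smaller and can be negative.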

The **reps(10)** option indicates that the program will go through the process of
generating random datasets 10 times and will average the eigenvalues obtained from the
10 correlation matrices. You do not have to specify a large number of replications
to make this procedure work well. The eigenvalues of the random datasets do not
vary tremendously. Ten replications should be sufficient.
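The averaging over replications can be sketched as follows in Python/NumPy. This is an illustration under stated assumptions, not the **fapara** implementation: the function name `averaged_random_eigenvalues`, the normal random draws, and the seed are hypothetical. The function also returns the standard deviation of each eigenvalue across replications, so you can see for yourself how much the random eigenvalues vary.

```python
import numpy as np

def averaged_random_eigenvalues(n_obs, n_vars, reps, rng):
    """Mean and standard deviation of random-data eigenvalues over `reps` datasets."""
    all_eigs = np.empty((reps, n_vars))
    for r in range(reps):
        # One random dataset per replication (normal draws are an assumption).
        data = rng.standard_normal((n_obs, n_vars))
        corr = np.corrcoef(data, rowvar=False)
        # Store eigenvalues in descending order, one row per replication.
        all_eigs[r] = np.linalg.eigvalsh(corr)[::-1]
    return all_eigs.mean(axis=0), all_eigs.std(axis=0, ddof=1)

rng = np.random.default_rng(7)  # arbitrary seed
mean_eigs, sd_eigs = averaged_random_eigenvalues(568, 6, reps=10, rng=rng)
print(mean_eigs)  # averaged eigenvalues, analogous to a PA column
print(sd_eigs)    # spread of each eigenvalue across the 10 replications
```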

`webuse bg2`

`pca bg2cost1-bg2cost6`

    Principal components/correlation                  Number of obs    =       568
                                                      Number of comp.  =         6
                                                      Trace            =         6
        Rotation: (unrotated = principal)             Rho              =    1.0000

        --------------------------------------------------------------------------
           Component |   Eigenvalue   Difference         Proportion   Cumulative
        -------------+------------------------------------------------------------
               Comp1 |      1.70622      .303339             0.2844       0.2844
               Comp2 |      1.40288      .494225             0.2338       0.5182
               Comp3 |      .908652      .185673             0.1514       0.6696
               Comp4 |      .722979     .0560588             0.1205       0.7901
               Comp5 |       .66692      .074563             0.1112       0.9013
               Comp6 |      .592357            .             0.0987       1.0000
        --------------------------------------------------------------------------

    Principal components (eigenvectors)

        ----------------------------------------------------------------------------------------
            Variable |    Comp1     Comp2     Comp3     Comp4     Comp5     Comp6 | Unexplained
        -------------+------------------------------------------------------------+-------------
            bg2cost1 |   0.2741    0.5302   -0.2712   -0.7468   -0.0104   -0.1111 |           0
            bg2cost2 |  -0.3713    0.4428   -0.4974    0.2800    0.2996    0.5005 |           0
            bg2cost3 |  -0.4077    0.4834    0.0656    0.2466   -0.5649   -0.4646 |           0
            bg2cost4 |  -0.3766    0.2748    0.7266   -0.2213    0.4504    0.0538 |           0
            bg2cost5 |   0.4776    0.3345    0.3829    0.1950   -0.3942    0.5657 |           0
            bg2cost6 |   0.5009    0.3192    0.0144    0.4647    0.4824   -0.4453 |           0
        ----------------------------------------------------------------------------------------

`fapara, pca reps(10)`

    PA -- Parallel Analysis for Principle Components

    PA Eigenvalues Averaged Over 10 Replications

                 PCA         PA        Dif
    c1        1.7062     1.1366     0.5696
    c2        1.4029     1.0637     0.3392
    c3        0.9087     1.0343    -0.1257
    c4        0.7230     0.9707    -0.2477
    c5        0.6669     0.9269    -0.2600
    c6        0.5924     0.8677    -0.2754

The parallel analysis for this example indicates that two components should be retained. There are two ways to tell this: (1) only the first two eigenvalues in the PCA column are greater than the corresponding average eigenvalues in the PA column, and (2) the dashed line for parallel analysis in the graph crosses the solid pca line before reaching the third component.
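The first rule can be written as a short check. The Python sketch below uses the PCA and PA values reported in the output above. The function name `n_to_retain` is hypothetical, and the stopping rule (count leading components until an observed eigenvalue no longer exceeds its random counterpart) is one reasonable reading of the comparison, not something **fapara** reports itself.

```python
# Eigenvalues copied from the PCA and PA columns of the fapara output.
pca_eigs = [1.7062, 1.4029, 0.9087, 0.7230, 0.6669, 0.5924]
pa_eigs = [1.1366, 1.0637, 1.0343, 0.9707, 0.9269, 0.8677]

def n_to_retain(observed, random_avg):
    """Count leading components whose observed eigenvalue exceeds the random average."""
    count = 0
    for obs, rnd in zip(observed, random_avg):
        if obs <= rnd:
            break  # stop at the first component that does not beat random data
        count += 1
    return count

print(n_to_retain(pca_eigs, pa_eigs))  # prints 2
```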

For the next example, we will run a factor analysis. This time we will run the
**fapara** command without the **pca** option because this is a
factor analysis. We will leave the number of replications at 10.

`factor bg2cost1-bg2cost6`

    (obs=568)

    Factor analysis/correlation                    Number of obs    =      568
        Method: principal factors                  Retained factors =        3
        Rotation: (unrotated)                      Number of params =       15

        --------------------------------------------------------------------------
             Factor  |   Eigenvalue   Difference        Proportion   Cumulative
        -------------+------------------------------------------------------------
            Factor1  |      0.85389      0.31282            1.0310       1.0310
            Factor2  |      0.54107      0.51786            0.6533       1.6844
            Factor3  |      0.02321      0.17288            0.0280       1.7124
            Factor4  |     -0.14967      0.03951           -0.1807       1.5317
            Factor5  |     -0.18918      0.06197           -0.2284       1.3033
            Factor6  |     -0.25115            .           -0.3033       1.0000
        --------------------------------------------------------------------------
        LR test: independent vs. saturated:  chi2(15) =  269.07 Prob>chi2 = 0.0000

    Factor loadings (pattern matrix) and unique variances

        -----------------------------------------------------------
            Variable |  Factor1   Factor2   Factor3 |   Uniqueness
        -------------+------------------------------+--------------
            bg2cost1 |   0.2470    0.3670   -0.0446 |      0.8023
            bg2cost2 |  -0.3374    0.3321   -0.0772 |      0.7699
            bg2cost3 |  -0.3764    0.3756    0.0204 |      0.7169
            bg2cost4 |  -0.3221    0.1942    0.1034 |      0.8479
            bg2cost5 |   0.4550    0.2479    0.0641 |      0.7274
            bg2cost6 |   0.4760    0.2364   -0.0068 |      0.7175
        -----------------------------------------------------------

`fapara, reps(10)`

    PA -- Parallel Analysis for Factor Analysis

    PA Eigenvalues Averaged Over 10 Replications

                  FA         PA        Dif
    c1        0.8539     0.1488     0.7051
    c2        0.5411     0.0882     0.4529
    c3        0.0232     0.0256    -0.0023
    c4       -0.1497    -0.0118    -0.1379
    c5       -0.1892    -0.0707    -0.1184
    c6       -0.2512    -0.1260    -0.1252

The parallel analysis indicates that there are at least two factors, with a possibility of a third: the eigenvalue for the third factor (0.0232) is very close in value to the average eigenvalue for the third random factor in the PA column (0.0256). This also shows up in the graph, where the dashed parallel analysis line crosses the solid factor analysis line right at three factors.