Subject: Statistics
Paper: Multivariate Analysis
Module: Principal Components Analysis
Development Team
Principal investigator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta
Paper co-ordinator: Dr. Sugata SenRoy, Professor, Department of Statistics, University of Calcutta
Content writer: Souvik Bandyopadhyay, Senior Lecturer, Indian Institute of Public Health, Hyderabad
Content reviewer: Dr. Kalyan Das, Professor, Department of Statistics, University of Calcutta
Principal Components from Standardized Variables
- Let $X = [X_1, X_2, \ldots, X_m]'$ have mean $\mu = [\mu_1, \mu_2, \ldots, \mu_m]'$ and dispersion matrix $\Sigma = ((\sigma_{jk}))$ with eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m \ge 0$.
- Define the standardized variables
  $$Z_j = \frac{X_j - \mu_j}{\sqrt{\sigma_{jj}}}, \quad j = 1, \ldots, m.$$
- For $Z = [Z_1, Z_2, \ldots, Z_m]'$ and $V$, a diagonal matrix with elements $\sigma_{jj}$,
  $$Z = V^{-1/2}(X - \mu).$$
- Clearly $E(Z) = 0$ and
  $$\mathrm{Cov}(Z) = V^{-1/2}\,\Sigma\,V^{-1/2} = \rho.$$
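The identity $\mathrm{Cov}(Z) = V^{-1/2}\Sigma V^{-1/2} = \rho$ can be checked numerically. A minimal NumPy sketch, using a hypothetical $3 \times 3$ covariance matrix:

```python
import numpy as np

# Hypothetical covariance matrix Sigma for m = 3 variables.
Sigma = np.array([[4.0, 2.0, 0.5],
                  [2.0, 9.0, 1.5],
                  [0.5, 1.5, 1.0]])

# V^{-1/2}: diagonal matrix with entries 1 / sqrt(sigma_jj).
V_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(Sigma)))

# Cov(Z) = V^{-1/2} Sigma V^{-1/2} is exactly the correlation matrix rho of X.
rho = V_inv_sqrt @ Sigma @ V_inv_sqrt
print(np.round(rho, 4))  # unit diagonal, correlations off the diagonal
```

The diagonal of the result is all ones, confirming that each $Z_j$ has unit variance.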
Standardized Variables
- The principal components of $Z$ may be obtained from the eigenvectors of the correlation matrix $\rho$ of $X$.
- All the previous results remain valid, since the variance of each $Z_j$ is unity.
- $Y_j$ will be used to denote the $j$th principal component and $(\lambda_j, e_j)$ the eigenvalue-eigenvector pair from either $\rho$ or $\Sigma$.
- However, the $(\lambda_j, e_j)$ derived from $\Sigma$ are in general not the same as the ones derived from $\rho$.
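The last point can be seen directly: a sketch, again with a hypothetical covariance matrix, comparing the eigenvalues of $\Sigma$ with those of $\rho$:

```python
import numpy as np

# Hypothetical 2x2 covariance matrix with unequal variances.
Sigma = np.array([[4.0, 2.0],
                  [2.0, 9.0]])
V_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(Sigma)))
rho = V_inv_sqrt @ Sigma @ V_inv_sqrt  # correlation matrix

# Eigenvalues in decreasing order for each matrix.
lam_Sigma = np.sort(np.linalg.eigvalsh(Sigma))[::-1]
lam_rho = np.sort(np.linalg.eigvalsh(rho))[::-1]

print(lam_Sigma)  # roughly [9.70, 3.30]: eigenvalues of Sigma, summing to 13
print(lam_rho)    # [4/3, 2/3]: eigenvalues of rho, summing to m = 2
```

The two sets of eigenpairs are clearly different, so the principal components extracted from $\Sigma$ and from $\rho$ are not simple rescalings of one another.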
Standardized Variables
- Result 4: The $j$th principal component of the standardized variables $Z' = [Z_1, Z_2, \ldots, Z_m]$ with $\mathrm{Cov}(Z) = \rho$ is given by
  $$Y_j = e_j' Z = e_j' V^{-1/2}(X - \mu), \quad j = 1, 2, \ldots, m.$$
  Moreover,
  $$\sum_{j=1}^{m} \mathrm{Var}(Y_j) = \sum_{j=1}^{m} \mathrm{Var}(Z_j) = m,$$
  and
  $$\rho_{Y_j, Z_k} = e_{jk}\sqrt{\lambda_j}, \quad j, k = 1, 2, \ldots, m.$$
  In this case $(\lambda_1, e_1), (\lambda_2, e_2), \ldots, (\lambda_m, e_m)$ are the eigenvalue-eigenvector pairs of $\rho$, with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m \ge 0$.
Standardized Variables
- The proof of Result 4 follows from Results 1, 2 and 3 with $Z_1, Z_2, \ldots, Z_m$ in place of $X_1, X_2, \ldots, X_m$ and $\rho$ in place of $\Sigma$.
- We know that the total population variance (for the standardized variables) is simply $m$, the sum of the diagonal elements of the matrix $\rho$.
- Therefore,
  $$\text{Proportion of standardized population variance due to the } j\text{th principal component} = \frac{\lambda_j}{m}, \quad j = 1, 2, \ldots, m,$$
  where the $\lambda_j$'s are the eigenvalues of $\rho$.
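The claims of Result 4 can be verified numerically. A sketch with a hypothetical $3 \times 3$ correlation matrix, checking that the component variances sum to $m$ and that $\rho_{Y_j, Z_k} = e_{jk}\sqrt{\lambda_j}$ behaves as expected:

```python
import numpy as np

# A hypothetical correlation matrix rho for m = 3 standardized variables.
rho = np.array([[1.0, 0.5, 0.2],
                [0.5, 1.0, 0.4],
                [0.2, 0.4, 1.0]])
m = rho.shape[0]

lam, E = np.linalg.eigh(rho)      # ascending eigenvalues
lam, E = lam[::-1], E[:, ::-1]    # reorder so lambda_1 >= ... >= lambda_m

# Sum of the principal component variances equals m (the trace of rho).
print(lam.sum())

# Proportion of standardized population variance per component: lambda_j / m.
print(lam / m)

# rho_{Y_j, Z_k} = e_{jk} sqrt(lambda_j); entry [k, j] below is that correlation.
corr_YZ = E * np.sqrt(lam)

# Each row has squared norm 1, since Var(Z_k) = sum_j lambda_j e_{jk}^2 = 1.
print((corr_YZ ** 2).sum(axis=1))
```

The row-norm check at the end is a useful sanity test: the squared correlations of any one $Z_k$ with all $m$ components must account for its entire unit variance.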
Geometrical Interpretation
- Suppose $X$ is distributed as $N_m(\mu, \Sigma)$. Then the density of $X$ is constant on the $\mu$-centered ellipsoids
  $$(x - \mu)'\,\Sigma^{-1}\,(x - \mu) = c^2. \quad \ldots(*)$$
- These ellipsoids have axes $\pm c\sqrt{\lambda_j}\, e_j, \; j = 1, 2, \ldots, m$, where the $(\lambda_j, e_j)$ are the eigenvalue-eigenvector pairs of $\Sigma$.
- A point lying on the $j$th axis of the ellipsoid will have coordinates proportional to $e_j' = [e_{j1}, e_{j2}, \ldots, e_{jm}]$ in the coordinate system that has origin $\mu$ and axes parallel to the original axes $x_1, x_2, \ldots, x_m$.
Geometrical Interpretation
- Setting $\mu = 0$ in $(*)$ we can write
  $$c^2 = x'\Sigma^{-1}x = \frac{1}{\lambda_1}(e_1' x)^2 + \frac{1}{\lambda_2}(e_2' x)^2 + \cdots + \frac{1}{\lambda_m}(e_m' x)^2,$$
  where $e_1' x, e_2' x, \ldots, e_m' x$ are recognized as the principal components of $x$.
- Setting $y_1 = e_1' x, \; y_2 = e_2' x, \ldots, y_m = e_m' x$, we have
  $$c^2 = \frac{y_1^2}{\lambda_1} + \frac{y_2^2}{\lambda_2} + \cdots + \frac{y_m^2}{\lambda_m}.$$
- This equation defines an ellipsoid (since $\lambda_1, \lambda_2, \ldots, \lambda_m$ are positive) in a coordinate system with axes $y_1, y_2, \ldots, y_m$ lying in the directions of $e_1, e_2, \ldots, e_m$.
Geometrical Interpretation
- If $\lambda_1$ is the largest eigenvalue, then the major axis lies in the direction $e_1$.
- The remaining minor axes lie in the directions defined by $e_2, \ldots, e_m$.
- The principal components $y_1 = e_1' x, \; y_2 = e_2' x, \ldots, y_m = e_m' x$ lie in the directions of the axes of a constant-density ellipsoid.
- Therefore any point on the $j$th ellipsoid axis has $x$-coordinates proportional to $e_j' = [e_{j1}, e_{j2}, \ldots, e_{jm}]$ and necessarily principal component coordinates of the form $[0, \ldots, 0, y_j, 0, \ldots, 0]$.
- When $\mu \neq 0$, it is the mean-centered principal components $y_j = e_j'(x - \mu)$ that have this interpretation.
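This geometry is easy to verify numerically. A sketch assuming a hypothetical $2 \times 2$ covariance matrix and $c = 1$: the axis endpoint $c\sqrt{\lambda_1}\,e_1$ lies on the ellipsoid, and its principal component coordinates have a single nonzero entry.

```python
import numpy as np

# Axes of the constant-density ellipsoid (x - mu)' Sigma^{-1} (x - mu) = c^2
# are +/- c sqrt(lambda_j) e_j.  Hypothetical 2x2 Sigma, mu = 0, c = 1.
Sigma = np.array([[3.0, 1.0],
                  [1.0, 2.0]])
mu = np.zeros(2)
c = 1.0

lam, E = np.linalg.eigh(Sigma)
lam, E = lam[::-1], E[:, ::-1]   # lambda_1 >= lambda_2

# Endpoint of the major axis (direction e_1, half-length c sqrt(lambda_1)).
x = mu + c * np.sqrt(lam[0]) * E[:, 0]

# It satisfies the ellipsoid equation: x' Sigma^{-1} x = c^2.
print(x @ np.linalg.inv(Sigma) @ x)   # ~ 1.0

# Its principal component coordinates y_j = e_j' x are of the form [y_1, 0].
print(E.T @ x)
```

The second coordinate of $E'x$ is zero (up to floating point), matching the claim that a point on the $j$th axis has component coordinates $[0, \ldots, 0, y_j, 0, \ldots, 0]$.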
Special Covariance Structures
- Case I: Suppose $\Sigma$ is the diagonal matrix
  $$\Sigma = \begin{bmatrix} \sigma_{11} & 0 & \cdots & 0 \\ 0 & \sigma_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{mm} \end{bmatrix}. \quad \ldots(i)$$
- Setting $e_j' = [0, \ldots, 0, 1, 0, \ldots, 0]$, with 1 in the $j$th position, we observe that
  $$\begin{bmatrix} \sigma_{11} & 0 & \cdots & 0 \\ 0 & \sigma_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{mm} \end{bmatrix} \begin{bmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ \sigma_{jj} \\ \vdots \\ 0 \end{bmatrix}, \quad \text{or} \quad \Sigma e_j = \sigma_{jj} e_j.$$
Special Covariance Structures
- $(\sigma_{jj}, e_j)$ is the $j$th eigenvalue-eigenvector pair.
- Since the linear combination $e_j' X = X_j$, the set of principal components is just the original set of uncorrelated random variables.
- For a covariance matrix like $(i)$, nothing is gained by extracting the principal components.
- If $X \sim N_m(\mu, \Sigma)$, the contours of constant density are ellipsoids whose axes already lie in the directions of maximum variation.
- Consequently there is no need to rotate the coordinate system.
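Case I can be illustrated with a quick check: for a diagonal $\Sigma$ (hypothetical variances below), the eigenvalues are the $\sigma_{jj}$ themselves and the eigenvectors are the coordinate unit vectors, so the principal components coincide with the original variables.

```python
import numpy as np

# Case I: a diagonal covariance matrix with hypothetical variances.
Sigma = np.diag([5.0, 3.0, 1.0])

lam, E = np.linalg.eigh(Sigma)
lam, E = lam[::-1], E[:, ::-1]   # lambda_1 >= lambda_2 >= lambda_3

print(lam)        # the variances sigma_jj themselves, in decreasing order
print(np.abs(E))  # unit vectors: each e_j has a single 1 (sign is arbitrary)
```

Since $e_j' X = X_j$, applying PCA here just returns the data in a (possibly reordered) original coordinate system.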
Special Covariance Structures
- Standardization does not substantially alter the situation for the $\Sigma$ in $(i)$. In that case $\rho = I$, the $m \times m$ identity matrix.
- Clearly $\rho\, e_j = 1 \cdot e_j$, so the eigenvalue 1 has multiplicity $m$, and $e_j' = [0, \ldots, 0, 1, 0, \ldots, 0], \; j = 1, 2, \ldots, m$, are convenient choices for the eigenvectors.
- The principal components determined from $\rho$ are again the original variables $Z_1, Z_2, \ldots, Z_m$.
- Since the eigenvalues are equal, the multivariate normal ellipsoids of constant density are spheroids.
Special Covariance Structures
- Case II: Consider another covariance matrix, one that often describes the correspondence among certain biological variables (e.g. the sizes of living organisms):
  $$\Sigma = \begin{bmatrix} \sigma^2 & \rho\sigma^2 & \cdots & \rho\sigma^2 \\ \rho\sigma^2 & \sigma^2 & \cdots & \rho\sigma^2 \\ \vdots & \vdots & \ddots & \vdots \\ \rho\sigma^2 & \rho\sigma^2 & \cdots & \sigma^2 \end{bmatrix}. \quad \ldots(ii)$$
  The resulting correlation matrix is
  $$\rho = \begin{bmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{bmatrix}. \quad \ldots(iii)$$
Special Covariance Structures
- The matrix in $(iii)$ implies that the variables $X_1, X_2, \ldots, X_m$ are equally correlated.
- The $m$ eigenvalues of the correlation matrix can be divided into two groups. When $\rho$ is positive the largest is
  $$\lambda_1 = 1 + (m - 1)\rho \quad \ldots(iv)$$
  with associated eigenvector
  $$e_1' = \left[\frac{1}{\sqrt{m}}, \frac{1}{\sqrt{m}}, \ldots, \frac{1}{\sqrt{m}}\right]. \quad \ldots(v)$$
  The remaining $m - 1$ eigenvalues are
  $$\lambda_2 = \lambda_3 = \cdots = \lambda_m = 1 - \rho.$$
Special Covariance Structures
- One choice for these eigenvectors is
  $$e_2' = \left[\frac{1}{\sqrt{1 \cdot 2}}, \frac{-1}{\sqrt{1 \cdot 2}}, 0, \ldots, 0\right]$$
  $$e_3' = \left[\frac{1}{\sqrt{2 \cdot 3}}, \frac{1}{\sqrt{2 \cdot 3}}, \frac{-2}{\sqrt{2 \cdot 3}}, 0, \ldots, 0\right]$$
  $$\vdots$$
  $$e_j' = \left[\frac{1}{\sqrt{(j-1)j}}, \ldots, \frac{1}{\sqrt{(j-1)j}}, \frac{-(j-1)}{\sqrt{(j-1)j}}, 0, \ldots, 0\right]$$
  $$\vdots$$
  $$e_m' = \left[\frac{1}{\sqrt{(m-1)m}}, \ldots, \frac{1}{\sqrt{(m-1)m}}, \frac{-(m-1)}{\sqrt{(m-1)m}}\right]$$
Special Covariance Structures
- The first principal component
  $$Y_1 = e_1' Z = \frac{1}{\sqrt{m}} \sum_{j=1}^{m} Z_j$$
  is proportional to the sum of the $m$ standardized variables.
- It might be regarded as an index with equal weights.
- This principal component explains a proportion
  $$\frac{\lambda_1}{m} = \frac{1 + (m-1)\rho}{m} = \rho + \frac{1 - \rho}{m} \quad \ldots(vi)$$
  of the total population variation.
Special Covariance Structures
- $\lambda_1 / m \approx \rho$ for $\rho$ close to 1 or $m$ large. For example, if $\rho = 0.80$ and $m = 5$, the first component explains $84\%$ of the total variability.
- When $\rho$ is near 1, a very small proportion of the total variance is explained by the last $m - 1$ components.
- In this case we retain only the first principal component $Y_1$.
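The equicorrelation eigenstructure, including the $\rho = 0.80$, $m = 5$ example, can be confirmed with a short NumPy sketch:

```python
import numpy as np

m, r = 5, 0.80  # m equally correlated variables, common correlation rho = 0.80
rho = r * np.ones((m, m)) + (1 - r) * np.eye(m)  # matrix (iii)

lam = np.sort(np.linalg.eigvalsh(rho))[::-1]

# Largest eigenvalue 1 + (m-1)rho = 4.2; the remaining m-1 all equal 1 - rho = 0.2.
print(lam)

# Proportion of total variation explained by Y_1: lambda_1 / m = 0.84.
print(lam[0] / m)
```

Changing `r` toward 1 pushes $\lambda_1/m$ toward 1, illustrating why a single component suffices under near-perfect equicorrelation.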
Scree Plot
- A scree plot is a useful visual aid for determining the appropriate number of principal components.
- A scree plot is a plot of $\hat{\lambda}_j$ versus $j$, where $\hat{\lambda}_1 \ge \hat{\lambda}_2 \ge \cdots \ge \hat{\lambda}_m \ge 0$.
- It is a downward-sloping curve, since successive principal components account for less and less of the variability.
- To determine the appropriate number of components, we look for a kink (or elbow) in the scree plot: the curve flattens out after this point, and very little variability can be explained by the later principal components.
- The number of components is taken to be the point at which the remaining eigenvalues are relatively small and all about the same size.
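The eigenvalues that feed a scree plot are easy to compute from a sample correlation matrix. A sketch with simulated (hypothetical) data in which one common factor drives all variables, so the first eigenvalue dominates and the curve then flattens:

```python
import numpy as np

# Simulated data: one strong common factor shared by m variables,
# so the first sample eigenvalue should dominate and the rest level off.
rng = np.random.default_rng(0)
n, m = 500, 6
common = rng.normal(size=(n, 1))
X = common + 0.5 * rng.normal(size=(n, m))

R = np.corrcoef(X, rowvar=False)              # sample correlation matrix
lam = np.sort(np.linalg.eigvalsh(R))[::-1]    # lambda_hat_1 >= ... >= lambda_hat_m

for j, l in enumerate(lam, start=1):
    print(f"component {j}: eigenvalue {l:.3f}, "
          f"cumulative proportion {lam[:j].sum() / m:.3f}")

# The scree plot itself is these eigenvalues against j, e.g. with matplotlib:
#   plt.plot(range(1, m + 1), lam, "o-")
# and we look for the elbow where the curve levels off.
```

Here the elbow appears after the first component: the remaining eigenvalues are small and roughly equal, exactly the pattern the slide describes.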
Summary
- The principal components of standardized variables were examined.
- The geometry of principal components was considered.
- Principal components of variables with some special covariance structures were discussed.