Academic year: 2022

Subject: Statistics

Paper: Multivariate Analysis

Module: Principal Components Analysis

Development Team

Principal investigator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta

Paper co-ordinator: Dr. Sugata SenRoy, Professor, Department of Statistics, University of Calcutta

Content writer: Souvik Bandyopadhyay, Senior Lecturer, Indian Institute of Public Health, Hyderabad

Content reviewer: Dr. Kalyan Das, Professor, Department of Statistics, University of Calcutta


Principal Components from Standardized Variables

- Let X = [X_1, X_2, ..., X_m]' have mean µ = [µ_1, µ_2, ..., µ_m]' and dispersion matrix Σ = ((σ_jj)) with eigenvalues λ_1 ≥ λ_2 ≥ ... ≥ λ_m ≥ 0.

- Define the standardized variables

  Z_j = (X_j − µ_j)/√σ_jj,  j = 1, ..., m.

- For Z = [Z_1, Z_2, ..., Z_m]' and V, the diagonal matrix with diagonal elements σ_jj,

  Z = V^{−1/2}(X − µ).

- Clearly E(Z) = 0 and

  Cov(Z) = V^{−1/2} Σ V^{−1/2} = ρ.

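The standardization step can be checked numerically. The sketch below (NumPy; the matrix Sigma is an illustrative choice, not one from the module) forms V^{−1/2} and confirms that V^{−1/2} Σ V^{−1/2} is a correlation matrix with unit diagonal.

```python
# Sketch: verifying Cov(Z) = V^{-1/2} Sigma V^{-1/2} = rho numerically.
# Sigma below is an illustrative covariance matrix, not one from the text.
import numpy as np

Sigma = np.array([[4.0, 1.2, 0.6],
                  [1.2, 1.0, 0.3],
                  [0.6, 0.3, 9.0]])

# V^{-1/2}: diagonal matrix with entries 1/sqrt(sigma_jj)
V_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(Sigma)))

# Correlation matrix rho = V^{-1/2} Sigma V^{-1/2}
rho = V_inv_sqrt @ Sigma @ V_inv_sqrt

# Each standardized variable has unit variance: diag(rho) = 1
print(np.diag(rho))   # -> [1. 1. 1.]
```

The off-diagonal entries of `rho` are the correlations of the original variables, e.g. ρ_12 = 1.2/(2·1) = 0.6 for this Sigma.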

Standardized Variables

- The principal components of Z may be obtained from the eigenvectors of the correlation matrix ρ of X.

- All the previous results will be valid, since the variance of each Z_j is unity.

- Y_j will be used to denote the jth principal component, and (λ_j, e_j) the eigenvalue–eigenvector pair from either ρ or Σ.

- However, the (λ_j, e_j) derived from Σ are in general not the same as the ones derived from ρ.

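The last point is easy to see numerically: when one variable has a much larger variance, the eigenpairs of Σ and of ρ are very different. The 2×2 Sigma below is an illustrative choice, not from the module.

```python
# Sketch: eigenvalues of Sigma vs. eigenvalues of the corresponding
# correlation matrix rho differ, so PCA on Sigma and on rho give
# different components (illustrative Sigma).
import numpy as np

Sigma = np.array([[1.0, 0.4],
                  [0.4, 100.0]])
V_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(Sigma)))
rho = V_inv_sqrt @ Sigma @ V_inv_sqrt

lam_sigma = np.sort(np.linalg.eigvalsh(Sigma))[::-1]
lam_rho = np.sort(np.linalg.eigvalsh(rho))[::-1]

print(lam_sigma)   # dominated by the large-variance variable
print(lam_rho)     # eigenvalues of rho always sum to m = 2
```

For Σ, the first eigenvalue is essentially the variance 100 of the second variable; for ρ, the two eigenvalues are close to 1 each.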

Standardized Variables

- Result 4: The jth principal component of the standardized variables Z = [Z_1, Z_2, ..., Z_m]' with Cov(Z) = ρ is given by

  Y_j = e_j' Z = e_j' V^{−1/2}(X − µ),  j = 1, 2, ..., m.

  Moreover,

  Var(Y_1) + Var(Y_2) + ... + Var(Y_m) = Var(Z_1) + Var(Z_2) + ... + Var(Z_m) = m,

  and

  ρ_{Y_j, Z_k} = e_{jk} √λ_j,  j, k = 1, 2, ..., m.

  Here (λ_1, e_1), (λ_2, e_2), ..., (λ_m, e_m) are the eigenvalue–eigenvector pairs of ρ, with λ_1 ≥ λ_2 ≥ ... ≥ λ_m ≥ 0.
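Both claims of Result 4 can be verified numerically. The sketch below uses an illustrative correlation matrix (not from the module): the PC variances sum to m = trace(ρ), and corr(Y_j, Z_k) = e_jk √λ_j follows since Cov(Y, Z) = E'ρ and ρ e_j = λ_j e_j.

```python
# Sketch: numerical check of Result 4 on an illustrative rho.
import numpy as np

rho = np.array([[1.0, 0.5, 0.2],
                [0.5, 1.0, 0.3],
                [0.2, 0.3, 1.0]])
m = rho.shape[0]

lam, E = np.linalg.eigh(rho)       # eigh returns ascending order
lam, E = lam[::-1], E[:, ::-1]     # sort descending; columns of E are e_j

# Var(Y_j) = lambda_j, so the PC variances sum to trace(rho) = m
assert np.isclose(lam.sum(), m)

# corr(Y_j, Z_k) = e_jk sqrt(lambda_j):
# Cov(Y_j, Z_k) = (E' rho)_{jk} = lambda_j e_jk, Var(Y_j) = lambda_j, Var(Z_k) = 1
corr = (E.T @ rho) / np.sqrt(lam)[:, None]
expected = E.T * np.sqrt(lam)[:, None]
print(np.allclose(corr, expected))   # -> True
```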

Standardized Variables

- The proof of Result 4 follows from Results 1, 2 and 3, with Z_1, Z_2, ..., Z_m in place of X_1, X_2, ..., X_m and ρ in place of Σ.

- We know that the total population variance (for the standardized variables) is simply m, the sum of the diagonal elements of the matrix ρ.

- Therefore, the proportion of standardized population variance due to the jth principal component is λ_j/m, j = 1, 2, ..., m, where the λ_j's are the eigenvalues of ρ.

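These proportions λ_j/m necessarily sum to 1, which is a quick sanity check in code. The correlation matrix below is illustrative, not from the module.

```python
# Sketch: proportion of standardized population variance explained by
# each PC is lambda_j / m, and the proportions sum to 1.
import numpy as np

rho = np.array([[1.0, 0.4, 0.1],
                [0.4, 1.0, 0.2],
                [0.1, 0.2, 1.0]])
m = rho.shape[0]

lam = np.sort(np.linalg.eigvalsh(rho))[::-1]
proportions = lam / m

print(np.isclose(proportions.sum(), 1.0))   # -> True: shares add to 1
```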

Geometrical Interpretation

- Suppose X is distributed as N_m(µ, Σ). Then the density of X is constant on the µ-centered ellipsoids

  (x − µ)' Σ^{−1} (x − µ) = c²  ...(∗)

- These ellipsoids have axes ±c √λ_j e_j, j = 1, 2, ..., m, where the (λ_j, e_j) are the eigenvalue–eigenvector pairs of Σ.

- A point lying on the jth axis of the ellipsoid will have coordinates proportional to e_j' = [e_j1, e_j2, ..., e_jm] in the coordinate system that has origin µ and axes parallel to the original axes x_1, x_2, ..., x_m.


Geometrical Interpretation

- Setting µ = 0 in (∗), we can write

  c² = x' Σ^{−1} x = (1/λ_1)(e_1' x)² + (1/λ_2)(e_2' x)² + ... + (1/λ_m)(e_m' x)²,

  where e_1'x, e_2'x, ..., e_m'x are recognized as the principal components of x.

- Setting y_1 = e_1'x, y_2 = e_2'x, ..., y_m = e_m'x, we have

  c² = y_1²/λ_1 + y_2²/λ_2 + ... + y_m²/λ_m.

- This equation defines an ellipsoid (since λ_1, λ_2, ..., λ_m are positive) in a coordinate system with axes y_1, y_2, ..., y_m lying in the directions of e_1, e_2, ..., e_m.

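The identity c² = x'Σ^{−1}x = Σ_j (e_j'x)²/λ_j follows from the spectral decomposition Σ = EΛE', and can be confirmed numerically. The 2×2 Sigma and the random x below are illustrative.

```python
# Sketch: x' Sigma^{-1} x equals sum_j (e_j' x)^2 / lambda_j,
# checked for an illustrative Sigma and a random point x.
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
lam, E = np.linalg.eigh(Sigma)

x = rng.standard_normal(2)
lhs = x @ np.linalg.inv(Sigma) @ x
y = E.T @ x                  # principal component coordinates y_j = e_j' x
rhs = np.sum(y**2 / lam)

print(np.isclose(lhs, rhs))  # -> True
```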

Geometrical Interpretation

- If λ_1 is the largest eigenvalue, then the major axis lies in the direction e_1.

- The remaining minor axes lie in the directions defined by e_2, ..., e_m.

- The principal components y_1 = e_1'x, y_2 = e_2'x, ..., y_m = e_m'x lie in the directions of the axes of a constant-density ellipsoid.

- Therefore any point on the jth ellipsoid axis has x-coordinates proportional to e_j' = [e_j1, e_j2, ..., e_jm], and necessarily principal component coordinates of the form [0, ..., 0, y_j, 0, ..., 0].

- When µ ≠ 0, it is the mean-centered principal component y_j = e_j'(x − µ) that has this interpretation.

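The axis endpoints ±c √λ_j e_j can be checked directly: each one satisfies the ellipsoid equation (∗). The sketch below uses an illustrative bivariate Sigma and an arbitrary c, with µ = 0.

```python
# Sketch: the points c*sqrt(lambda_j)*e_j lie on the constant-density
# ellipse x' Sigma^{-1} x = c^2 (illustrative Sigma, mu = 0).
import numpy as np

Sigma = np.array([[3.0, 1.0],
                  [1.0, 2.0]])
lam, E = np.linalg.eigh(Sigma)
c = 1.5                                  # any positive constant

Sigma_inv = np.linalg.inv(Sigma)
for j in range(2):
    p = c * np.sqrt(lam[j]) * E[:, j]    # jth axis endpoint
    # p' Sigma^{-1} p = c^2 lambda_j e_j' Sigma^{-1} e_j = c^2
    assert np.isclose(p @ Sigma_inv @ p, c**2)

print("axis endpoints lie on the ellipse x' Sigma^{-1} x = c^2")
```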

Special Covariance Structures

- Case I: Suppose Σ is the diagonal matrix

  Σ = diag(σ_11, σ_22, ..., σ_mm).  ...(i)

- Setting e_j' = [0, ..., 0, 1, 0, ..., 0], with 1 in the jth position, we observe that Σ e_j picks out the jth column of Σ, which is σ_jj times the jth unit vector, i.e.

  Σ e_j = σ_jj e_j.


Special Covariance Structures

- Thus (σ_jj, e_j) is the jth eigenvalue–eigenvector pair.

- Since the linear combination e_j'X = X_j, the set of principal components is just the original set of uncorrelated random variables.

- For a covariance matrix like (i), nothing is gained by extracting the principal components.

- If X ∼ N_m(µ, Σ), the contours of constant density are ellipsoids whose axes already lie in the directions of maximum variation.

- Consequently there is no need to rotate the coordinate system.

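Case I can be seen in code: for a diagonal Σ the eigenvectors are (up to sign) the coordinate axes, so the PCs are just the original variables. The diagonal entries below are illustrative.

```python
# Sketch of Case I: eigenpairs of a diagonal Sigma are the diagonal
# entries paired with the coordinate axes (illustrative values).
import numpy as np

Sigma = np.diag([5.0, 3.0, 1.0])
lam, E = np.linalg.eigh(Sigma)
lam, E = lam[::-1], E[:, ::-1]   # descending order

print(lam)                       # -> [5. 3. 1.]
# Each eigenvector is +/- a coordinate axis: |E| is the identity
print(np.abs(E))
```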

Special Covariance Structures

- Standardization does not substantially alter the situation for the Σ in (i). In that case ρ = I, the m × m identity matrix.

- Clearly ρ e_j = 1 · e_j, so that the eigenvalue 1 has multiplicity m, and e_j' = [0, ..., 0, 1, 0, ..., 0], j = 1, 2, ..., m, are convenient choices for the eigenvectors.

- The principal components determined from ρ are also the original variables Z_1, Z_2, ..., Z_m.

- Since the eigenvalues are equal, the multivariate normal ellipsoids of constant density are spheroids.


Special Covariance Structures

- Case II: Consider another covariance matrix, one that often describes the correspondence among certain biological variables (e.g. the sizes of living organisms):

      | σ²    ρσ²   ...   ρσ² |
  Σ = | ρσ²   σ²    ...   ρσ² |    ...(ii)
      | ...   ...   ...   ... |
      | ρσ²   ρσ²   ...   σ²  |

  The resulting correlation matrix is

      | 1    ρ    ...   ρ |
  ρ = | ρ    1    ...   ρ |    ...(iii)
      | ...  ...  ...  ...|
      | ρ    ρ    ...   1 |

Special Covariance Structures

- The matrix in (iii) implies that the variables X_1, X_2, ..., X_m are equally correlated.

- The m eigenvalues of the correlation matrix can be divided into two groups. When ρ is positive, the largest is

  λ_1 = 1 + (m − 1)ρ  ...(iv)

  with associated eigenvector

  e_1' = [1/√m, 1/√m, ..., 1/√m].  ...(v)

  The remaining m − 1 eigenvalues are

  λ_2 = λ_3 = ... = λ_m = 1 − ρ.


Special Covariance Structures

- One choice for these eigenvectors is

  e_2' = [1/√(1·2), −1/√(1·2), 0, ..., 0]

  e_3' = [1/√(2·3), 1/√(2·3), −2/√(2·3), 0, ..., 0]

  ...

  e_j' = [1/√((j−1)j), ..., 1/√((j−1)j), −(j−1)/√((j−1)j), 0, ..., 0]

  ...

  e_m' = [1/√((m−1)m), ..., 1/√((m−1)m), −(m−1)/√((m−1)m)]
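These vectors (Helmert-type contrasts) can be checked in code: each has unit length and is an eigenvector of the equicorrelation matrix with eigenvalue 1 − ρ, because its entries sum to zero. The values of m and ρ below are illustrative.

```python
# Sketch: building e_2, ..., e_m from the displayed formulas and checking
# that each is a unit eigenvector of the equicorrelation matrix with
# eigenvalue 1 - rho (illustrative m and rho).
import numpy as np

m, r = 4, 0.5
rho = np.full((m, m), r) + (1 - r) * np.eye(m)

for j in range(2, m + 1):
    e = np.zeros(m)
    e[: j - 1] = 1.0 / np.sqrt((j - 1) * j)     # j-1 equal positive entries
    e[j - 1] = -(j - 1) / np.sqrt((j - 1) * j)  # one balancing negative entry
    assert np.isclose(e @ e, 1.0)               # unit length
    assert np.allclose(rho @ e, (1 - r) * e)    # eigenvalue 1 - rho

print("all contrast vectors are unit eigenvectors with eigenvalue 1 - rho")
```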

Special Covariance Structures

- The first principal component,

  Y_1 = e_1' Z = (1/√m)(Z_1 + Z_2 + ... + Z_m),

  is proportional to the sum of the m standardized variables.

- It might be regarded as an index with equal weights.

- This principal component explains a proportion

  λ_1/m = [1 + (m − 1)ρ]/m = ρ + (1 − ρ)/m  ...(vi)

  of the total population variation.

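The explained proportion in (vi) is easy to confirm numerically: the largest eigenvalue of the equicorrelation matrix divided by m equals ρ + (1 − ρ)/m. The values of m and ρ below are illustrative.

```python
# Sketch verifying (vi): lambda_1/m = rho + (1 - rho)/m for the
# equicorrelation matrix (illustrative m and rho).
import numpy as np

m, r = 6, 0.3
rho = np.full((m, m), r) + (1 - r) * np.eye(m)

lam = np.linalg.eigvalsh(rho)   # ascending order
share = lam[-1] / m             # proportion explained by PC1

print(np.isclose(share, r + (1 - r) / m))   # -> True
```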

Special Covariance Structures

- λ_1/m ≈ ρ for ρ close to 1 or for m large. For example, if ρ = 0.80 and m = 5, the first component explains 84% of the total variability.

- When ρ is near 1, a very small proportion of the total variance is explained by the last m − 1 components.

- In this case we retain only the first principal component Y_1.

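The 84% figure in the example is just formula (vi) evaluated at ρ = 0.80, m = 5:

```python
# Quick arithmetic check of the example: rho = 0.80, m = 5 gives
# lambda_1/m = (1 + 4*0.8)/5 = 0.84, i.e. 84% explained by PC1.
m, r = 5, 0.80
share = (1 + (m - 1) * r) / m
print(share)   # -> 0.84
```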

Scree Plot

- A scree plot is a useful visual aid for determining the appropriate number of principal components.

- A scree plot is a plot of λ̂_j versus j, where λ̂_1 ≥ λ̂_2 ≥ ... ≥ λ̂_m ≥ 0.

- This is a downward-sloping curve, since the amount of variability accounted for by successive principal components decreases.

- To determine the appropriate number of components, we look for a kink (or an elbow) in the scree plot: the curve flattens out after this point, and very little variability can be explained by the later principal components.

- The number of components is taken to be the point at which the remaining eigenvalues are relatively small and all about the same size.

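The scree-plot rule can be sketched in code. The eigenvalues below are illustrative (with a clear elbow after the second component), and the cut-off rule used to locate the elbow is a hypothetical rule of thumb, not a prescription from the module.

```python
# Sketch: locating the elbow of a scree plot. lam holds illustrative
# sorted eigenvalues; the flat tail starts after the second component.
import numpy as np

lam = np.array([3.8, 1.7, 0.45, 0.40, 0.35, 0.30])

assert np.all(np.diff(lam) <= 0)   # scree curve is downward sloping

# Hypothetical elbow heuristic: stop where the drop to the next
# eigenvalue becomes small relative to the first drop.
drops = -np.diff(lam)
k = int(np.argmax(drops < 0.2 * drops[0]))

print(k)   # -> 2: retain the first two components
```

In practice one would plot λ̂_j against j (e.g. with matplotlib) and judge the elbow visually rather than by a fixed threshold.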

Summary

- The principal components of standardized variables were examined.

- The geometry of principal components was considered.

- Principal components of variables with some special covariance structures were discussed.
