Subject: Statistics
Paper: Regression Analysis III
Module: Inference for the logistic model
Development Team
Principal investigator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta
Paper co-ordinator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta
Content writers: Sayantee Jana, Graduate student, Department of Mathematics and Statistics, McMaster University; Sujit Kumar Ray, Analytics professional, Kolkata
GLM
There are some situations where the usual linear models are not appropriate:
- the range of Y is restricted (e.g. binary, count)
- the variance of Y depends on the mean
Generalized linear models take account of both these issues.
Likelihood estimation
- Estimation of $\underline{\beta}$ (from now on we denote a vector by a letter with a bar underneath, written here as $\underline{\beta}$).
- Maximum likelihood method: with $\underline{\beta} = (\beta_1, \beta_2, \ldots, \beta_p)'$,

$$\frac{\partial L}{\partial \beta_r} = \frac{\partial}{\partial \beta_r}\sum_{i=1}^{n}\left[\frac{y_i\theta_i - b(\theta_i)}{a(\phi_i)} + C(y_i,\phi_i)\right] = \sum_{i=1}^{n}\frac{\partial L_i}{\partial \theta_i}\,\frac{\partial \theta_i}{\partial \mu_i}\,\frac{\partial \mu_i}{\partial \eta_i}\,\frac{\partial \eta_i}{\partial \beta_r} = \sum_{i=1}^{n}\left[\frac{y_i - b'(\theta_i)}{a(\phi_i)}\right]\frac{1}{V_i}\,\frac{\partial \mu_i}{\partial \eta_i}\,x_{ir},$$

where $V_i = d\mu_i/d\theta_i$.
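For concreteness, here is this chain rule worked for one family (a specialization not spelled out on the slide, using the binomial model introduced below): for $Y_i \sim \mathrm{Bin}(m_i, \pi_i)$ with the canonical (logit) link, $\theta_i = \log\frac{\pi_i}{1-\pi_i} = \eta_i$, $b(\theta_i) = m_i\log(1+e^{\theta_i})$ and $a(\phi_i) = 1$, so that $b'(\theta_i) = m_i\pi_i = \mu_i$ and $V_i = d\mu_i/d\theta_i = m_i\pi_i(1-\pi_i) = \partial\mu_i/\partial\eta_i$. The general expression then collapses to

$$\frac{\partial L}{\partial \beta_r} = \sum_{i=1}^{n}(y_i - m_i\pi_i)\,x_{ir}.$$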
Likelihood estimation
- Setting $\partial L/\partial \beta_r = 0$: no closed-form solution is possible.
- Hence we use iterative techniques.
- Let $\hat{\underline{\beta}}^{(m)}$ be the solution at the $m$th iteration.
- $\hat{\underline{\beta}}^{(m+1)} = \hat{\underline{\beta}}^{(m)} + A^{-1}\underline{u}$ (a generic coding of this step is sketched below)
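This update is easy to code generically. A minimal sketch, assuming user-supplied functions `score(beta)` and `info(beta)` (hypothetical names) returning $\underline{u}$ and $A$:

```python
import numpy as np

def scoring_iteration(score, info, beta0, tol=1e-8, max_iter=50):
    """Iterate beta^(m+1) = beta^(m) + A^{-1} u until the step is negligible."""
    beta = np.asarray(beta0, dtype=float)
    for _ in range(max_iter):
        u = score(beta)               # score vector u at the current beta
        A = info(beta)                # information matrix A at the current beta
        step = np.linalg.solve(A, u)  # A^{-1} u without forming the inverse
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta
```

Concrete `score` and `info` functions for binary data are given after the "Likelihood solution" slide below.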
Score vector
where

$$\underline{u}_{p\times 1} = \frac{\partial L}{\partial \underline{\beta}} = \begin{pmatrix} \partial L/\partial \beta_1 \\ \partial L/\partial \beta_2 \\ \vdots \\ \partial L/\partial \beta_p \end{pmatrix} = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_p \end{pmatrix}$$

$\underline{u}$: score vector.
Fisher information matrix
- $A_{p\times p} = -\frac{\partial^2 L}{\partial \underline{\beta}\,\partial \underline{\beta}'} = ((a_{rs}))$ → sample information matrix, with $a_{rs} = -\frac{\partial^2 L}{\partial \beta_r\,\partial \beta_s}$.
- $A = E\left[-\frac{\partial^2 L}{\partial \underline{\beta}\,\partial \underline{\beta}'}\right]$ ← Fisher information matrix. When the iteration uses this expected information in place of the sample information, the procedure is called the method of scoring (see the numerical sketch below).
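The distinction matters for non-canonical links. A small numerical sketch (simulated grouped binomial data with a probit link; data and names are of my choosing, not from the lecture): the sample information, obtained by numerically differentiating the score, generally differs from the expected information.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(8), rng.normal(size=8)])  # n = 8 covariate classes
m = np.full(8, 20)                                     # m_i trials per class
beta = np.array([0.3, -0.5])
y = rng.binomial(m, norm.cdf(X @ beta))                # y_i successes

def score(b):
    eta = X @ b
    pi, dpi = norm.cdf(eta), norm.pdf(eta)             # pi_i and dpi_i/deta_i
    return X.T @ ((y - m * pi) / (pi * (1 - pi)) * dpi)

def expected_info(b):
    eta = X @ b
    pi, dpi = norm.cdf(eta), norm.pdf(eta)
    w = m * dpi**2 / (pi * (1 - pi))                   # w_i (see a later slide)
    return X.T @ (w[:, None] * X)

# sample information: -d2L/db db' via finite differences of the score
eps = 1e-6
H = np.column_stack([(score(beta + eps * e) - score(beta - eps * e)) / (2 * eps)
                     for e in np.eye(2)])
print(-H)                   # sample information
print(expected_info(beta))  # Fisher information: close but not equal for probit
```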
Binary Data
- Response $y$ takes two values, say 0 and 1.
- $\underline{x} = (x_1, x_2, \ldots, x_p)$: set of covariates.
- $\pi_i = P[y_i = 1 \mid \underline{x}_i]$, $0 \le \pi_i \le 1$, $i = 1(1)n$.
- Suppose for each $\underline{x}_i$ there are $m_i$ observations (forming a covariate class).
- $y_i$ = number of 1's among the $m_i$ observations belonging to the covariate class $\underline{x}_i$ (see the grouping sketch below).
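As an illustration of forming covariate classes (a sketch with hypothetical column names, not from the slides), raw 0/1 responses can be collapsed into the $(y_i, m_i)$ counts with pandas:

```python
import pandas as pd

# raw data: one row per subject, a binary outcome and two covariates
raw = pd.DataFrame({
    "dose": [1, 1, 1, 2, 2, 2, 2, 3, 3],
    "sex":  ["F", "F", "M", "F", "M", "M", "M", "F", "F"],
    "y":    [0, 1, 0, 1, 1, 0, 1, 1, 1],
})

# one row per covariate class x_i: y_i = number of 1's, m_i = class size
classes = (raw.groupby(["dose", "sex"])["y"]
              .agg(y="sum", m="count")
              .reset_index())
print(classes)
```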
Binary Data
- $Y_i \sim \mathrm{Bin}(m_i, \pi_i)$, $i = 1(1)n$, where $\mu_i = E(Y_i) = m_i\pi_i$.
- $\eta_i = \underline{x}_i'\underline{\beta}$: linear predictor.
- Link functions $\eta_i = g(\mu_i) = g^*(\pi_i)$, the standard choices being (coded in the sketch below):
  - logit: $\eta_i = \log\frac{\pi_i}{1-\pi_i}$
  - probit: $\eta_i = \Phi^{-1}(\pi_i)$
  - complementary log-log: $\eta_i = \log\{-\log(1-\pi_i)\}$
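A minimal sketch of the three links and their inverses (using scipy's normal distribution for the probit):

```python
import numpy as np
from scipy.stats import norm

# eta = g*(pi) and its inverse pi = g*^{-1}(eta) for the three standard links
links = {
    "logit":   (lambda p: np.log(p / (1 - p)),    lambda e: 1 / (1 + np.exp(-e))),
    "probit":  (norm.ppf,                         norm.cdf),
    "cloglog": (lambda p: np.log(-np.log(1 - p)), lambda e: 1 - np.exp(-np.exp(e))),
}

pi = 0.75
for name, (g, g_inv) in links.items():
    eta = g(pi)
    print(f"{name:8s} eta = {eta:+.4f}  recovered pi = {g_inv(eta):.4f}")
```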
Likelihood estimation for Binary Data
- Estimation of $\underline{\beta}$: $\underline{\beta}$ is estimated by the maximum likelihood method using some iterative technique.

$$L(\underline{\pi};\underline{y}) = \sum_{i=1}^{n}\left[y_i\log\frac{\pi_i}{1-\pi_i} + m_i\log(1-\pi_i)\right] + \text{terms independent of } \underline{\pi}$$

$$u_r = \frac{\partial L}{\partial \beta_r} = \sum_{i=1}^{n}\frac{\partial L}{\partial \pi_i}\,\frac{\partial \pi_i}{\partial \eta_i}\,\frac{\partial \eta_i}{\partial \beta_r} = \sum_{i=1}^{n}\frac{y_i - m_i\pi_i}{\pi_i(1-\pi_i)}\left(\frac{\partial \pi_i}{\partial \eta_i}\right)x_{ir},$$

where $\partial \pi_i/\partial \eta_i$ depends on the choice of the link function.
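As a sanity check on this formula, the analytic score can be compared with a finite-difference gradient of $L$ (a sketch with simulated data and the logit link, for which $\partial\pi_i/\partial\eta_i = \pi_i(1-\pi_i)$):

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(10), rng.normal(size=10)])
m = np.full(10, 25)
beta = np.array([0.2, 0.8])
y = rng.binomial(m, 1 / (1 + np.exp(-(X @ beta))))

def loglik(b):
    # L = sum[ y_i log(pi_i/(1-pi_i)) + m_i log(1-pi_i) ]
    pi = 1 / (1 + np.exp(-(X @ b)))
    return np.sum(y * np.log(pi / (1 - pi)) + m * np.log(1 - pi))

def score(b):
    # u_r = sum (y_i - m_i pi_i)/(pi_i(1-pi_i)) * dpi/deta * x_ir,
    # which reduces to sum (y_i - m_i pi_i) x_ir for the logit link
    pi = 1 / (1 + np.exp(-(X @ b)))
    return X.T @ (y - m * pi)

eps = 1e-6
numerical = np.array([(loglik(beta + eps * e) - loglik(beta - eps * e)) / (2 * eps)
                      for e in np.eye(2)])
print(score(beta))  # analytic score
print(numerical)    # finite-difference gradient: should agree closely
```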
Likelihood solution
$\underline{u}' = (u_1, u_2, \cdots, u_p)$: $\underline{u}$ is the score vector.

$$a_{rs} = E\left[-\frac{\partial^2 L}{\partial \beta_r\,\partial \beta_s}\right] = \sum_{i=1}^{n}\frac{m_i}{\pi_i(1-\pi_i)}\left(\frac{\partial \pi_i}{\partial \eta_i}\right)^2 x_{ir}x_{is} = \sum_{i=1}^{n}w_i\,x_{ir}x_{is}, \quad \text{where } w_i = \frac{m_i}{\pi_i(1-\pi_i)}\left(\frac{\partial \pi_i}{\partial \eta_i}\right)^2.$$

Let $W = \mathrm{diag}((w_i))$. Then $a_{rs} = \underline{x}_r' W \underline{x}_s$ (with $\underline{x}_r$ denoting the $r$th column of the design matrix), and at the $q$th iteration

$$\hat{\underline{\beta}}^{(q)} = \hat{\underline{\beta}}^{(q-1)} + A^{-1}\underline{u}.$$
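Putting the pieces together, here is a minimal sketch of the full scoring iteration for grouped binary data (simulated data; the caller supplies the inverse link $\pi(\eta)$ and its derivative $\partial\pi/\partial\eta$, with the logit shown as an example):

```python
import numpy as np

def fisher_scoring_binomial(X, y, m, inv_link, dinv_link, n_iter=25, tol=1e-10):
    """Fisher scoring for a binomial GLM: beta^(q) = beta^(q-1) + A^{-1} u."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        pi, dpi = inv_link(eta), dinv_link(eta)
        u = X.T @ ((y - m * pi) / (pi * (1 - pi)) * dpi)  # score vector u
        w = m * dpi**2 / (pi * (1 - pi))                  # w_i as on the slide
        A = X.T @ (w[:, None] * X)                        # A = X' W X
        step = np.linalg.solve(A, u)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

inv_logit = lambda e: 1 / (1 + np.exp(-e))
d_inv_logit = lambda e: inv_logit(e) * (1 - inv_logit(e))  # dpi/deta = pi(1-pi)

rng = np.random.default_rng(2)
X = np.column_stack([np.ones(12), rng.normal(size=12)])
m = np.full(12, 30)
y = rng.binomial(m, inv_logit(X @ np.array([-0.4, 1.0])))
print(fisher_scoring_binomial(X, y, m, inv_logit, d_inv_logit))  # ~(-0.4, 1.0)
```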
Logit link
- For the logit link, $\frac{\partial \pi_i}{\partial \eta_i} = \pi_i(1-\pi_i)$, so $w_i = m_i\pi_i(1-\pi_i)$.
- Alternatively, differentiating $u_r$ once more with respect to $\beta_s$,

$$a_{rs} = -E\left[\sum_{i=1}^{n}(y_i - m_i\pi_i)\,\frac{\partial}{\partial \beta_s}\left\{\frac{1}{\pi_i(1-\pi_i)}\frac{\partial \pi_i}{\partial \eta_i}\,x_{ir}\right\} - \sum_{i=1}^{n}\frac{\partial(m_i\pi_i)}{\partial \beta_s}\,\frac{1}{\pi_i(1-\pi_i)}\frac{\partial \pi_i}{\partial \eta_i}\,x_{ir}\right]$$

- The first term is 0: its expectation vanishes since $E(y_i) = m_i\pi_i$, and for the logit link the expression in braces reduces to $x_{ir}$, so the term vanishes identically even before taking expectations.
- Hence, for the logit link, the Newton-Raphson method and the method of scoring are the same; equivalently, the sample information matrix coincides with the Fisher information matrix (verified numerically in the sketch below).
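A quick numerical check of this coincidence (a sketch with simulated data of my choosing): for the logit link, the sample information obtained by numerically differentiating the score equals $X'WX$ with $w_i = m_i\pi_i(1-\pi_i)$.

```python
import numpy as np

rng = np.random.default_rng(3)
X = np.column_stack([np.ones(9), rng.normal(size=9)])
m = np.full(9, 15)
beta = np.array([0.1, -0.7])
inv_logit = lambda e: 1 / (1 + np.exp(-e))
y = rng.binomial(m, inv_logit(X @ beta))

def score(b):
    pi = inv_logit(X @ b)
    return X.T @ (y - m * pi)  # logit score: sum (y_i - m_i pi_i) x_ir

# sample information -d2L/db db' via finite differences of the score
eps = 1e-6
sample_info = -np.column_stack(
    [(score(beta + eps * e) - score(beta - eps * e)) / (2 * eps)
     for e in np.eye(2)])

# Fisher information X' W X with w_i = m_i pi_i (1 - pi_i)
pi = inv_logit(X @ beta)
w = m * pi * (1 - pi)
fisher_info = X.T @ (w[:, None] * X)
print(np.allclose(sample_info, fisher_info, atol=1e-4))  # True
```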
Summary
- No closed-form solution is available for the parameter vector.
- Iterative methods such as the method of scoring or the Newton-Raphson method can be used.
- The $q$th iteration gives $\hat{\underline{\beta}}^{(q)} = \hat{\underline{\beta}}^{(q-1)} + A^{-1}\underline{u}$.
- $A_{p\times p} = -\frac{\partial^2 L}{\partial \underline{\beta}\,\partial \underline{\beta}'} = ((a_{rs}))$ → sample information matrix, with $a_{rs} = -\frac{\partial^2 L}{\partial \beta_r\,\partial \beta_s}$.
- $A = E\left[-\frac{\partial^2 L}{\partial \underline{\beta}\,\partial \underline{\beta}'}\right]$ ← Fisher information matrix.