Subject: Statistics
Paper: Regression Analysis III
Module: Inference for the logistic model
Development Team
Principal investigator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta
Paper co-ordinator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta
Content writers: Sayantee Jana, Graduate student, Department of Mathematics and Statistics, McMaster University; Sujit Kumar Ray, Analytics professional, Kolkata
GLM
There are some situations where the usual linear models are not appropriate:
- the range of Y is restricted (e.g. binary, count)
- the variance of Y depends on the mean
Generalized linear models take account of both these issues.
Likelihood estimation
- Estimation of $\underline{\beta}$ (from now on we denote a vector by a letter with a bar underneath, written here as $\underline{\beta}$).
- Maximum likelihood method: with $\underline{\beta} = (\beta_1, \beta_2, \ldots, \beta_p)'$,

$$\frac{\partial L}{\partial \beta_r} = \frac{\partial}{\partial \beta_r}\sum_{i=1}^{n}\left[\frac{y_i\theta_i - b(\theta_i)}{a(\phi_i)} + C(y_i,\phi_i)\right] = \sum_{i=1}^{n}\frac{\partial L_i}{\partial \theta_i}\,\frac{\partial \theta_i}{\partial \mu_i}\,\frac{\partial \mu_i}{\partial \eta_i}\,\frac{\partial \eta_i}{\partial \beta_r} = \sum_{i=1}^{n}\left[\frac{y_i - b'(\theta_i)}{a(\phi_i)}\right]\frac{1}{V_i}\,\frac{\partial \mu_i}{\partial \eta_i}\,x_{ir},$$

where $V_i = d\mu_i/d\theta_i$.
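For concreteness, here is this chain rule worked for one family (a specialization not spelled out on the slide, using the binomial model introduced below): for $Y_i \sim \mathrm{Bin}(m_i, \pi_i)$ with the canonical (logit) link, $\theta_i = \log\frac{\pi_i}{1-\pi_i} = \eta_i$, $b(\theta_i) = m_i\log(1+e^{\theta_i})$ and $a(\phi_i) = 1$, so that $b'(\theta_i) = m_i\pi_i = \mu_i$ and $V_i = d\mu_i/d\theta_i = m_i\pi_i(1-\pi_i) = \partial\mu_i/\partial\eta_i$. The general expression then collapses to

$$\frac{\partial L}{\partial \beta_r} = \sum_{i=1}^{n}(y_i - m_i\pi_i)\,x_{ir}.$$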
Likelihood estimation
- Setting $\partial L/\partial \beta_r = 0$: no closed-form solution is possible.
- Hence we use iterative techniques.
- Let $\hat{\underline{\beta}}^{(m)}$ be the solution at the $m$th iteration.
- $\hat{\underline{\beta}}^{(m+1)} = \hat{\underline{\beta}}^{(m)} + A^{-1}\underline{u}$ (a generic coding of this step is sketched below)
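This update is easy to code generically. A minimal sketch, assuming user-supplied functions `score(beta)` and `info(beta)` (hypothetical names) returning $\underline{u}$ and $A$:

```python
import numpy as np

def scoring_iteration(score, info, beta0, tol=1e-8, max_iter=50):
    """Iterate beta^(m+1) = beta^(m) + A^{-1} u until the step is negligible."""
    beta = np.asarray(beta0, dtype=float)
    for _ in range(max_iter):
        u = score(beta)               # score vector u at the current beta
        A = info(beta)                # information matrix A at the current beta
        step = np.linalg.solve(A, u)  # A^{-1} u without forming the inverse
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta
```

Concrete `score` and `info` functions for binary data are given after the "Likelihood solution" slide below.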
Score vector
where

$$\underline{u}_{p\times 1} = \frac{\partial L}{\partial \underline{\beta}} = \begin{pmatrix} \partial L/\partial \beta_1 \\ \partial L/\partial \beta_2 \\ \vdots \\ \partial L/\partial \beta_p \end{pmatrix} = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_p \end{pmatrix}$$

$\underline{u}$: score vector.
Fisher information matrix
- $A_{p\times p} = -\frac{\partial^2 L}{\partial \underline{\beta}\,\partial \underline{\beta}'} = ((a_{rs}))$ → sample information matrix, with $a_{rs} = -\frac{\partial^2 L}{\partial \beta_r\,\partial \beta_s}$.
- $A = E\left[-\frac{\partial^2 L}{\partial \underline{\beta}\,\partial \underline{\beta}'}\right]$ ← Fisher information matrix. When the iteration uses this expected information in place of the sample information, the procedure is called the method of scoring (see the numerical sketch below).
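The distinction matters for non-canonical links. A small numerical sketch (simulated grouped binomial data with a probit link; data and names are of my choosing, not from the lecture): the sample information, obtained by numerically differentiating the score, generally differs from the expected information.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(8), rng.normal(size=8)])  # n = 8 covariate classes
m = np.full(8, 20)                                     # m_i trials per class
beta = np.array([0.3, -0.5])
y = rng.binomial(m, norm.cdf(X @ beta))                # y_i successes

def score(b):
    eta = X @ b
    pi, dpi = norm.cdf(eta), norm.pdf(eta)             # pi_i and dpi_i/deta_i
    return X.T @ ((y - m * pi) / (pi * (1 - pi)) * dpi)

def expected_info(b):
    eta = X @ b
    pi, dpi = norm.cdf(eta), norm.pdf(eta)
    w = m * dpi**2 / (pi * (1 - pi))                   # w_i (see a later slide)
    return X.T @ (w[:, None] * X)

# sample information: -d2L/db db' via finite differences of the score
eps = 1e-6
H = np.column_stack([(score(beta + eps * e) - score(beta - eps * e)) / (2 * eps)
                     for e in np.eye(2)])
print(-H)                   # sample information
print(expected_info(beta))  # Fisher information: close but not equal for probit
```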
Binary Data
- Response $y$ takes two values, say 0 and 1.
- $\underline{x} = (x_1, x_2, \ldots, x_p)$: set of covariates.
- $\pi_i = P[y_i = 1 \mid \underline{x}_i]$, $0 \le \pi_i \le 1$, $i = 1(1)n$.
- Suppose for each $\underline{x}_i$ there are $m_i$ observations (forming a covariate class).
- $y_i$ = number of 1's among the $m_i$ observations belonging to the covariate class $\underline{x}_i$ (see the grouping sketch below).
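As an illustration of forming covariate classes (a sketch with hypothetical column names, not from the slides), raw 0/1 responses can be collapsed into the $(y_i, m_i)$ counts with pandas:

```python
import pandas as pd

# raw data: one row per subject, a binary outcome and two covariates
raw = pd.DataFrame({
    "dose": [1, 1, 1, 2, 2, 2, 2, 3, 3],
    "sex":  ["F", "F", "M", "F", "M", "M", "M", "F", "F"],
    "y":    [0, 1, 0, 1, 1, 0, 1, 1, 1],
})

# one row per covariate class x_i: y_i = number of 1's, m_i = class size
classes = (raw.groupby(["dose", "sex"])["y"]
              .agg(y="sum", m="count")
              .reset_index())
print(classes)
```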
Binary Data
- $Y_i \sim \mathrm{Bin}(m_i, \pi_i)$, $i = 1(1)n$, where $\mu_i = E(Y_i) = m_i\pi_i$.
- $\eta_i = \underline{x}_i'\underline{\beta}$: linear predictor.
- Link functions $\eta_i = g(\mu_i) = g^*(\pi_i)$, the standard choices being (coded in the sketch below):
  - logit: $\eta_i = \log\frac{\pi_i}{1-\pi_i}$
  - probit: $\eta_i = \Phi^{-1}(\pi_i)$
  - complementary log-log: $\eta_i = \log\{-\log(1-\pi_i)\}$
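A minimal sketch of the three links and their inverses (using scipy's normal distribution for the probit):

```python
import numpy as np
from scipy.stats import norm

# eta = g*(pi) and its inverse pi = g*^{-1}(eta) for the three standard links
links = {
    "logit":   (lambda p: np.log(p / (1 - p)),    lambda e: 1 / (1 + np.exp(-e))),
    "probit":  (norm.ppf,                         norm.cdf),
    "cloglog": (lambda p: np.log(-np.log(1 - p)), lambda e: 1 - np.exp(-np.exp(e))),
}

pi = 0.75
for name, (g, g_inv) in links.items():
    eta = g(pi)
    print(f"{name:8s} eta = {eta:+.4f}  recovered pi = {g_inv(eta):.4f}")
```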
Likelihood estimation for Binary Data
- Estimation of $\underline{\beta}$: $\underline{\beta}$ is estimated by the maximum likelihood method using some iterative technique.

$$L(\underline{\pi};\underline{y}) = \sum_{i=1}^{n}\left[y_i\log\frac{\pi_i}{1-\pi_i} + m_i\log(1-\pi_i)\right] + \text{terms independent of } \underline{\pi}$$

$$u_r = \frac{\partial L}{\partial \beta_r} = \sum_{i=1}^{n}\frac{\partial L}{\partial \pi_i}\,\frac{\partial \pi_i}{\partial \eta_i}\,\frac{\partial \eta_i}{\partial \beta_r} = \sum_{i=1}^{n}\frac{y_i - m_i\pi_i}{\pi_i(1-\pi_i)}\left(\frac{\partial \pi_i}{\partial \eta_i}\right)x_{ir},$$

where $\partial \pi_i/\partial \eta_i$ depends on the choice of the link function.
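As a sanity check on this formula, the analytic score can be compared with a finite-difference gradient of $L$ (a sketch with simulated data and the logit link, for which $\partial\pi_i/\partial\eta_i = \pi_i(1-\pi_i)$):

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(10), rng.normal(size=10)])
m = np.full(10, 25)
beta = np.array([0.2, 0.8])
y = rng.binomial(m, 1 / (1 + np.exp(-(X @ beta))))

def loglik(b):
    # L = sum[ y_i log(pi_i/(1-pi_i)) + m_i log(1-pi_i) ]
    pi = 1 / (1 + np.exp(-(X @ b)))
    return np.sum(y * np.log(pi / (1 - pi)) + m * np.log(1 - pi))

def score(b):
    # u_r = sum (y_i - m_i pi_i)/(pi_i(1-pi_i)) * dpi/deta * x_ir,
    # which reduces to sum (y_i - m_i pi_i) x_ir for the logit link
    pi = 1 / (1 + np.exp(-(X @ b)))
    return X.T @ (y - m * pi)

eps = 1e-6
numerical = np.array([(loglik(beta + eps * e) - loglik(beta - eps * e)) / (2 * eps)
                      for e in np.eye(2)])
print(score(beta))  # analytic score
print(numerical)    # finite-difference gradient: should agree closely
```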
Likelihood solution
$\underline{u}' = (u_1, u_2, \cdots, u_p)$: $\underline{u}$ is the score vector.

$$a_{rs} = E\left[-\frac{\partial^2 L}{\partial \beta_r\,\partial \beta_s}\right] = \sum_{i=1}^{n}\frac{m_i}{\pi_i(1-\pi_i)}\left(\frac{\partial \pi_i}{\partial \eta_i}\right)^2 x_{ir}x_{is} = \sum_{i=1}^{n}w_i\,x_{ir}x_{is}, \quad \text{where } w_i = \frac{m_i}{\pi_i(1-\pi_i)}\left(\frac{\partial \pi_i}{\partial \eta_i}\right)^2.$$

Let $W = \mathrm{diag}((w_i))$. Then $a_{rs} = \underline{x}_r' W \underline{x}_s$ (with $\underline{x}_r$ denoting the $r$th column of the design matrix), and at the $q$th iteration

$$\hat{\underline{\beta}}^{(q)} = \hat{\underline{\beta}}^{(q-1)} + A^{-1}\underline{u}.$$
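Putting the pieces together, here is a minimal sketch of the full scoring iteration for grouped binary data (simulated data; the caller supplies the inverse link $\pi(\eta)$ and its derivative $\partial\pi/\partial\eta$, with the logit shown as an example):

```python
import numpy as np

def fisher_scoring_binomial(X, y, m, inv_link, dinv_link, n_iter=25, tol=1e-10):
    """Fisher scoring for a binomial GLM: beta^(q) = beta^(q-1) + A^{-1} u."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        pi, dpi = inv_link(eta), dinv_link(eta)
        u = X.T @ ((y - m * pi) / (pi * (1 - pi)) * dpi)  # score vector u
        w = m * dpi**2 / (pi * (1 - pi))                  # w_i as on the slide
        A = X.T @ (w[:, None] * X)                        # A = X' W X
        step = np.linalg.solve(A, u)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

inv_logit = lambda e: 1 / (1 + np.exp(-e))
d_inv_logit = lambda e: inv_logit(e) * (1 - inv_logit(e))  # dpi/deta = pi(1-pi)

rng = np.random.default_rng(2)
X = np.column_stack([np.ones(12), rng.normal(size=12)])
m = np.full(12, 30)
y = rng.binomial(m, inv_logit(X @ np.array([-0.4, 1.0])))
print(fisher_scoring_binomial(X, y, m, inv_logit, d_inv_logit))  # ~(-0.4, 1.0)
```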
Logit link
- For the logit link, $\frac{\partial \pi_i}{\partial \eta_i} = \pi_i(1-\pi_i)$, so $w_i = m_i\pi_i(1-\pi_i)$.
- Alternatively, differentiating $u_r$ once more with respect to $\beta_s$,

$$a_{rs} = -E\left[\sum_{i=1}^{n}(y_i - m_i\pi_i)\,\frac{\partial}{\partial \beta_s}\left\{\frac{1}{\pi_i(1-\pi_i)}\frac{\partial \pi_i}{\partial \eta_i}\,x_{ir}\right\} - \sum_{i=1}^{n}\frac{\partial(m_i\pi_i)}{\partial \beta_s}\,\frac{1}{\pi_i(1-\pi_i)}\frac{\partial \pi_i}{\partial \eta_i}\,x_{ir}\right]$$

- The first term is 0: its expectation vanishes since $E(y_i) = m_i\pi_i$, and for the logit link the expression in braces reduces to $x_{ir}$, so the term vanishes identically even before taking expectations.
- Hence, for the logit link, the Newton-Raphson method and the method of scoring are the same; equivalently, the sample information matrix coincides with the Fisher information matrix (verified numerically in the sketch below).
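A quick numerical check of this coincidence (a sketch with simulated data of my choosing): for the logit link, the sample information obtained by numerically differentiating the score equals $X'WX$ with $w_i = m_i\pi_i(1-\pi_i)$.

```python
import numpy as np

rng = np.random.default_rng(3)
X = np.column_stack([np.ones(9), rng.normal(size=9)])
m = np.full(9, 15)
beta = np.array([0.1, -0.7])
inv_logit = lambda e: 1 / (1 + np.exp(-e))
y = rng.binomial(m, inv_logit(X @ beta))

def score(b):
    pi = inv_logit(X @ b)
    return X.T @ (y - m * pi)  # logit score: sum (y_i - m_i pi_i) x_ir

# sample information -d2L/db db' via finite differences of the score
eps = 1e-6
sample_info = -np.column_stack(
    [(score(beta + eps * e) - score(beta - eps * e)) / (2 * eps)
     for e in np.eye(2)])

# Fisher information X' W X with w_i = m_i pi_i (1 - pi_i)
pi = inv_logit(X @ beta)
w = m * pi * (1 - pi)
fisher_info = X.T @ (w[:, None] * X)
print(np.allclose(sample_info, fisher_info, atol=1e-4))  # True
```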
Summary
- No closed-form solution is available for the parameter vector.
- Iterative methods such as the method of scoring or the Newton-Raphson method can be used.
- The $q$th iteration gives $\hat{\underline{\beta}}^{(q)} = \hat{\underline{\beta}}^{(q-1)} + A^{-1}\underline{u}$.
- $A_{p\times p} = -\frac{\partial^2 L}{\partial \underline{\beta}\,\partial \underline{\beta}'} = ((a_{rs}))$ → sample information matrix, with $a_{rs} = -\frac{\partial^2 L}{\partial \beta_r\,\partial \beta_s}$.
- $A = E\left[-\frac{\partial^2 L}{\partial \underline{\beta}\,\partial \underline{\beta}'}\right]$ ← Fisher information matrix.