
Subject: Statistics

Paper: Regression Analysis III

Module: Inference for the logistic model


Development Team

Principal investigator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta

Paper co-ordinator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta

Content writer: Sayantee Jana, Graduate student, Department of Mathematics and Statistics, McMaster University; Sujit Kumar Ray, Analytics professional, Kolkata

GLM

There are some situations where the usual linear models are not appropriate:

- the range of Y is restricted (e.g. binary, count)
- the variance of Y depends on the mean

Generalized linear models take account of both these issues.


Likelihood estimation

- Estimation of $\underline{\beta}$ (from now on we denote a vector by a letter with a bar below it): $\underline{\beta} = (\beta_1, \beta_2, \ldots, \beta_p)'$ is estimated by the maximum likelihood method.

\[
\frac{\partial L}{\partial \beta_r}
= \frac{\partial}{\partial \beta_r}\sum_{i=1}^{n}\left[\frac{y_i\theta_i - b(\theta_i)}{a(\phi_i)} + C(y_i,\phi_i)\right]
= \sum_{i=1}^{n}\frac{\partial L_i}{\partial \theta_i}\,\frac{\partial \theta_i}{\partial \mu_i}\,\frac{\partial \mu_i}{\partial \eta_i}\,\frac{\partial \eta_i}{\partial \beta_r}
= \sum_{i=1}^{n}\left[\frac{y_i - b'(\theta_i)}{a(\phi_i)}\right]\frac{1}{V_i}\,\frac{\partial \mu_i}{\partial \eta_i}\,x_{ir}
\]

where $V_i = \dfrac{\partial \mu_i}{\partial \theta_i}$.

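As a concrete check (a standard simplification, not spelled out on the slide): for a canonical link, $\eta_i = \theta_i$, so $\partial\mu_i/\partial\eta_i = \partial\mu_i/\partial\theta_i = V_i$, the factors $1/V_i$ and $\partial\mu_i/\partial\eta_i$ cancel, and since $b'(\theta_i) = \mu_i$ the score collapses to
\[
\frac{\partial L}{\partial \beta_r} = \sum_{i=1}^{n}\frac{y_i - \mu_i}{a(\phi_i)}\,x_{ir}.
\]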

Likelihood estimation

- $\dfrac{\partial L}{\partial \beta_r} = 0$ → no closed-form solution is possible.
- Hence we use iterative techniques.
- Let $\hat{\beta}^{(m)}$ be the solution at the $m$th iteration.
- $\hat{\underline{\beta}}^{(m+1)} = \hat{\underline{\beta}}^{(m)} + A^{-1}\underline{u}$

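A minimal sketch of this update in Python, assuming caller-supplied functions score(beta) and info(beta) that return $\underline{u}$ and $A$ (both names are illustrative, not from the slides):

```python
import numpy as np

def scoring_iteration(beta0, score, info, tol=1e-8, max_iter=50):
    """Method of scoring: repeat beta <- beta + A^{-1} u until convergence."""
    beta = np.asarray(beta0, dtype=float)
    for _ in range(max_iter):
        u = score(beta)                  # score vector u at the current beta
        A = info(beta)                   # expected information matrix A
        step = np.linalg.solve(A, u)     # A^{-1} u, without forming the inverse
        beta = beta + step
        if np.max(np.abs(step)) < tol:   # stop when the update is negligible
            break
    return beta
```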

Score vector

where
\[
\underline{u}_{p\times 1} = \frac{\partial L}{\partial \underline{\beta}}
= \begin{pmatrix} \partial L/\partial \beta_1 \\ \partial L/\partial \beta_2 \\ \vdots \\ \partial L/\partial \beta_p \end{pmatrix}
= \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_p \end{pmatrix}
\]
$\underline{u}$ is called the score vector.


Fisher information matrix

- $A_{p\times p} = -\dfrac{\partial^2 L}{\partial \underline{\beta}\,\partial \underline{\beta}'} = ((a_{rs}))$ → sample (observed) information matrix, with $a_{rs} = -\dfrac{\partial^2 L}{\partial \beta_r\,\partial \beta_s}$.
- $A = E\!\left[-\dfrac{\partial^2 L}{\partial \underline{\beta}\,\partial \underline{\beta}'}\right]$ ← Fisher information matrix. When this expected information is used in the iterative update, the procedure is called the method of scoring.


Binary Data

- The response $y$ takes two values, say 0 and 1.
- $\underline{x} = (x_1, x_2, \ldots, x_p)$: set of covariates.
- $\pi_i = P[y_i = 1 \mid \underline{x}_i]$, $0 \le \pi_i \le 1$, $i = 1(1)n$.
- Suppose for each $\underline{x}_i$ there are $m_i$ observations (forming a covariate class).
- $y_i$ = number of 1's among the $m_i$ observations belonging to the covariate class $\underline{x}_i$.


Binary Data

- $Y_i \sim \mathrm{Bin}(m_i, \pi_i)$, $i = 1(1)n$, where $\mu_i = E(Y_i) = m_i\pi_i$.
- $\eta_i = \underline{x}_i'\underline{\beta}$: linear predictor.
- Link functions $\eta_i = g(\mu_i)$, expressed in terms of $\pi_i$ (a code sketch follows this list):
  - logit: $\eta_i = \log\dfrac{\pi_i}{1-\pi_i}$
  - probit: $\eta_i = \Phi^{-1}(\pi_i)$
  - complementary log-log: $\eta_i = \log\{-\log(1-\pi_i)\}$

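A small sketch of these three links and their inverses in Python (scipy.stats.norm supplies $\Phi$ and $\Phi^{-1}$; the function names are ours):

```python
import numpy as np
from scipy.stats import norm

# Link functions eta = g(pi) ...
def logit(p):      return np.log(p / (1 - p))
def probit(p):     return norm.ppf(p)                  # Phi^{-1}(pi)
def cloglog(p):    return np.log(-np.log(1 - p))

# ... and their inverses pi = g^{-1}(eta)
def inv_logit(eta):   return 1 / (1 + np.exp(-eta))
def inv_probit(eta):  return norm.cdf(eta)             # Phi(eta)
def inv_cloglog(eta): return 1 - np.exp(-np.exp(eta))
```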

Likelihood estimation for Binary Data

- Estimation of $\underline{\beta}$: $\underline{\beta}$ is estimated by the maximum likelihood method using some iterative technique.

\[
L(\underline{\pi}; \underline{y}) = \sum_{i=1}^{n}\left[y_i\log\frac{\pi_i}{1-\pi_i} + m_i\log(1-\pi_i)\right] + \text{terms independent of } \pi
\]
\[
u_r = \frac{\partial L}{\partial \beta_r}
= \sum_{i=1}^{n}\frac{\partial L}{\partial \pi_i}\,\frac{\partial \pi_i}{\partial \eta_i}\,\frac{\partial \eta_i}{\partial \beta_r}
= \sum_{i=1}^{n}\frac{y_i - m_i\pi_i}{\pi_i(1-\pi_i)}\left(\frac{\partial \pi_i}{\partial \eta_i}\right)x_{ir}
\]

where $\partial \pi_i/\partial \eta_i$ depends on the choice of the link function.
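A sketch of this log-likelihood and score for the logit link in Python, with a finite-difference check of $u_r = \partial L/\partial \beta_r$ (data are simulated and all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
m = rng.integers(1, 8, size=n)              # group sizes m_i
beta = np.array([0.4, -0.8])
pi = 1 / (1 + np.exp(-(X @ beta)))
y = rng.binomial(m, pi)                     # y_i successes out of m_i

def loglik(b):
    """L(pi; y) up to terms not involving pi (the slide's formula)."""
    pr = 1 / (1 + np.exp(-(X @ b)))
    return np.sum(y * np.log(pr / (1 - pr)) + m * np.log(1 - pr))

def score(b):
    """Logit link: dpi/deta = pi(1-pi), so u_r = sum_i (y_i - m_i pi_i) x_ir."""
    pr = 1 / (1 + np.exp(-(X @ b)))
    return X.T @ (y - m * pr)

# Check the analytic score against a central finite difference
h = 1e-6
num = np.array([(loglik(beta + h * e) - loglik(beta - h * e)) / (2 * h)
                for e in np.eye(p)])
print(np.allclose(num, score(beta), rtol=1e-5))   # True
```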

Likelihood solution

$\underline{u}' = (u_1, u_2, \cdots, u_p)$: $\underline{u}$ is the score vector.

\[
a_{rs} = E\!\left[-\frac{\partial^2 L}{\partial \beta_r\,\partial \beta_s}\right]
= \sum_{i=1}^{n}\frac{m_i}{\pi_i(1-\pi_i)}\left(\frac{\partial \pi_i}{\partial \eta_i}\right)^2 x_{ir}x_{is}
= \sum_{i=1}^{n} w_i\,x_{ir}x_{is},
\qquad\text{where } w_i = \frac{m_i}{\pi_i(1-\pi_i)}\left(\frac{\partial \pi_i}{\partial \eta_i}\right)^2
\]

Let $W = \mathrm{diag}((w_i))$. Then $a_{rs} = \underline{x}_r' W \underline{x}_s$ (with $\underline{x}_r$ the $r$th column of the design matrix), and at the $q$th iteration
\[
\hat{\underline{\beta}}^{(q)} = \hat{\underline{\beta}}^{(q-1)} + A^{-1}\underline{u}.
\]

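Putting the pieces together, a compact sketch of the full scoring iteration for the logit link (the function name and arguments are illustrative; X is the n×p design matrix, y the success counts, m the group sizes):

```python
import numpy as np

def fisher_scoring_logit(X, y, m, n_iter=25, tol=1e-10):
    """Fit a binomial-logit GLM by the method of scoring.

    Logit link: dpi/deta = pi(1-pi), so u = X'(y - m*pi)
    and A = X'WX with w_i = m_i pi_i (1 - pi_i).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        pi = 1 / (1 + np.exp(-eta))      # inverse logit
        u = X.T @ (y - m * pi)           # score vector u
        W = m * pi * (1 - pi)            # diagonal of W
        A = X.T @ (W[:, None] * X)       # Fisher information X'WX
        step = np.linalg.solve(A, u)
        beta = beta + step               # beta_q = beta_{q-1} + A^{-1} u
        if np.max(np.abs(step)) < tol:
            break
    return beta
```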

Logit link

- For the logit link, $\dfrac{\partial \pi_i}{\partial \eta_i} = \pi_i(1-\pi_i)$, so $w_i = m_i\pi_i(1-\pi_i)$.
- Alternatively,
\[
a_{rs} = -E\!\left[\sum_{i=1}^{n}(y_i - m_i\pi_i)\,\frac{\partial}{\partial \beta_s}\left\{\frac{1}{\pi_i(1-\pi_i)}\,\frac{\partial \pi_i}{\partial \eta_i}\,x_{ir}\right\}
- \sum_{i=1}^{n}\frac{\partial (m_i\pi_i)}{\partial \beta_s}\,\frac{1}{\pi_i(1-\pi_i)}\,\frac{\partial \pi_i}{\partial \eta_i}\,x_{ir}\right]
\]
- The first term is 0, since $E(y_i) = m_i\pi_i$.
- For the logit link, the Newton-Raphson method and the method of scoring are the same; equivalently, the sample (observed) information matrix coincides with Fisher's information matrix.

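The coincidence of observed and expected information under the logit link is easy to check numerically; a sketch using finite differences on simulated data (all names and settings are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
m = rng.integers(1, 10, size=n)            # binomial denominators m_i
beta = np.array([0.5, -1.0, 0.3])
pi = 1 / (1 + np.exp(-(X @ beta)))
y = rng.binomial(m, pi)                    # simulated counts y_i

def loglik(b):
    pr = 1 / (1 + np.exp(-(X @ b)))
    return np.sum(y * np.log(pr / (1 - pr)) + m * np.log(1 - pr))

# Observed information: minus the numerical Hessian of L at beta
h = 1e-4
H = np.empty((p, p))
for r in range(p):
    for s in range(p):
        e_r, e_s = np.eye(p)[r] * h, np.eye(p)[s] * h
        H[r, s] = (loglik(beta + e_r + e_s) - loglik(beta + e_r - e_s)
                   - loglik(beta - e_r + e_s) + loglik(beta - e_r - e_s)) / (4 * h * h)

# Expected (Fisher) information: X'WX with w_i = m_i pi_i (1 - pi_i)
A_expected = X.T @ ((m * pi * (1 - pi))[:, None] * X)
print(np.allclose(-H, A_expected))         # True: the two matrices agree
```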

Summary

- No closed-form solution is available for the parameter vector.
- Iterative methods such as the method of scoring or the Newton-Raphson method can be used.
- The $q$th iteration gives $\hat{\underline{\beta}}^{(q)} = \hat{\underline{\beta}}^{(q-1)} + A^{-1}\underline{u}$.
- $A_{p\times p} = -\dfrac{\partial^2 L}{\partial \underline{\beta}\,\partial \underline{\beta}'} = ((a_{rs}))$ → sample information matrix, with $a_{rs} = -\dfrac{\partial^2 L}{\partial \beta_r\,\partial \beta_s}$.
- $A = E\!\left[-\dfrac{\partial^2 L}{\partial \underline{\beta}\,\partial \underline{\beta}'}\right]$ ← Fisher information matrix.

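In practice the same fit is available off the shelf; a sketch using statsmodels, whose GLM class with a Binomial family fits by exactly this IRLS/scoring scheme (the simulated data are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 40
X = sm.add_constant(rng.normal(size=(n, 2)))
m = rng.integers(5, 15, size=n)
pi = 1 / (1 + np.exp(-(X @ np.array([0.2, 1.0, -0.5]))))
y = rng.binomial(m, pi)

# Grouped binary data enter as a two-column response (successes, failures)
model = sm.GLM(np.column_stack([y, m - y]), X,
               family=sm.families.Binomial())    # logit link is the default
fit = model.fit()                                # IRLS, i.e. Fisher scoring
print(fit.params)                                # beta-hat
print(fit.bse)                                   # sqrt of diag(A^{-1})
```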
