Paper: Regression Analysis III
Module: Quasi-likelihood
Principal investigator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta
Paper co-ordinator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta
Content writers: Sayantee Jana, Graduate student, Department of Mathematics and Statistics, McMaster University; Sujit Kumar Ray, Analytics professional, Kolkata
Content reviewer: Department of Statistics, University of Calcutta
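- Exponential family density, taking $a(\phi) = \phi$:
  \[ f(y) = \exp\left\{ \frac{y\theta - b(\theta)}{\phi} + c(y, \phi) \right\} \]
- $E(Y) = \mu = b'(\theta)$
- $\mathrm{Var}(Y) = b''(\theta)\,\phi = V(\mu)\,\phi$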
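As a concrete instance (a standard textbook example, not taken from the slides), the Poisson distribution with mean $\mu$ can be written in this form. The identification of $\theta$, $b(\theta)$, $\phi$ and $c(y,\phi)$ below is the usual one:

```latex
\begin{aligned}
f(y) &= \frac{e^{-\mu}\mu^{y}}{y!}
      = \exp\{\, y\log\mu - \mu - \log y! \,\}, \\
\theta &= \log\mu, \qquad b(\theta) = e^{\theta}, \qquad \phi = 1, \qquad c(y,\phi) = -\log y!, \\
E(Y) &= b'(\theta) = e^{\theta} = \mu, \qquad
\mathrm{Var}(Y) = b''(\theta)\,\phi = e^{\theta} = \mu \;\Rightarrow\; V(\mu) = \mu .
\end{aligned}
```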
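- Log-likelihood: $L = \dfrac{y\theta - b(\theta)}{\phi} + c(y, \phi)$
- Differentiating $L$ with respect to $\mu$ and taking expectations:
  \[
  \begin{aligned}
  E\left[\frac{\partial L}{\partial \mu}\right]
    &= E\left[\frac{\partial L}{\partial \theta}\cdot\frac{\partial \theta}{\partial \mu}\right] \\
    &= E\left[\frac{\partial L}{\partial \theta}\right]\frac{\partial \theta}{\partial \mu} \\
    &= E\left[\frac{y - b'(\theta)}{\phi}\right]\frac{\partial \theta}{\partial \mu} = 0,
  \end{aligned}
  \]
  since $E(y) = b'(\theta)$.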
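- Variance of the score with respect to $\mu$:
  \[
  \mathrm{Var}\left[\frac{\partial L}{\partial \mu}\right]
    = \mathrm{Var}\left[\frac{\partial L}{\partial \theta}\right]\left(\frac{\partial \theta}{\partial \mu}\right)^{2}
    = E\left[-\frac{\partial^{2} L}{\partial \theta^{2}}\right]\left(\frac{\partial \theta}{\partial \mu}\right)^{2}
  \]
- where
  \[
  E\left[\left(\frac{\partial L}{\partial \theta}\right)^{2}\right]
    = E\left[-\frac{\partial^{2} L}{\partial \theta^{2}}\right]
    = \frac{b''(\theta)}{\phi},
  \qquad
  \frac{\partial \theta}{\partial \mu} = \frac{1}{b''(\theta)},
  \]
  so that
  \[
  \mathrm{Var}\left[\frac{\partial L}{\partial \mu}\right]
    = \frac{b''(\theta)}{\phi}\cdot\frac{1}{[b''(\theta)]^{2}}
    = \frac{1}{b''(\theta)\,\phi}.
  \]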
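The two moment results for the score, a zero mean and variance $1/(b''(\theta)\phi)$, can be checked exactly for a simple case. The sketch below assumes a Poisson response (so $\phi = 1$ and $b''(\theta) = \mu$); the value of `mu` and the truncation point of the support are illustrative choices, not part of the slides.

```r
# Exact check of E[dL/dmu] = 0 and Var[dL/dmu] = 1 / (b''(theta) * phi)
# for a Poisson response, where phi = 1 and b''(theta) = mu.
mu <- 3.5                      # illustrative mean (assumption)
y  <- 0:200                    # support, truncated where the mass is negligible
p  <- dpois(y, lambda = mu)    # Poisson probabilities

# dL/dmu = (y - b'(theta)) / phi * dtheta/dmu = (y - mu) / mu
score <- (y - mu) / mu

score_mean <- sum(score * p)
score_var  <- sum((score - score_mean)^2 * p)

print(c(mean = score_mean, variance = score_var, target = 1 / mu))
```

The printed variance should agree with `1 / mu` up to floating-point error.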
- $E\left[\dfrac{\partial L}{\partial \mu}\right] = 0$ ... (I)
- $\mathrm{Var}\left[\dfrac{\partial L}{\partial \mu}\right] = \dfrac{1}{b''(\theta)\,\phi}$ ... (II)
- Very often the underlying distribution of the population is not known (e.g., in the case of overdispersed data). In such cases the likelihood cannot be obtained, and hence inferences cannot be based on it. Recourse is then taken to a quasi-likelihood function, which mimics the likelihood function and satisfies (I) and (II).
- Define $q = \dfrac{y - \mu}{V(\mu)\,\phi}$
- $E(q) = 0$ and $\mathrm{Var}(q) = \dfrac{1}{V(\mu)\,\phi}$
- $q$ satisfies (I) and (II), and hence it can be used as a proxy for $\dfrac{\partial L}{\partial \mu}$.
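The two moments of $q$ follow from $E(Y) = \mu$ and $\mathrm{Var}(Y) = V(\mu)\phi$ alone, with no further distributional assumption. Written out:

```latex
\begin{aligned}
E(q) &= \frac{E(Y) - \mu}{V(\mu)\,\phi} = 0, \\
\mathrm{Var}(q) &= \frac{\mathrm{Var}(Y)}{[V(\mu)\,\phi]^{2}}
                 = \frac{V(\mu)\,\phi}{[V(\mu)\,\phi]^{2}}
                 = \frac{1}{V(\mu)\,\phi}.
\end{aligned}
```

Since $b''(\theta) = V(\mu)$, the variance matches the right-hand side of (II).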
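- Quasi-likelihood: $Q = \displaystyle\int_{y}^{\mu} \frac{y - u}{\phi\,V(u)}\,du$
- $\dfrac{\partial Q}{\partial \mu} = \dfrac{y - \mu}{\phi\,V(\mu)} = q$
- $Q$ is used as a substitute for $L$.
- The only requirement for writing down $q$ or $Q$ is to identify $V(\mu)$ as a function of $\mu$.
- To estimate $\boldsymbol{\beta}$: maximize $\sum_{i=1}^{n} Q_i$ with respect to $\boldsymbol{\beta}$.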
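\[
\begin{aligned}
u^{*}_{r} &= \frac{\partial}{\partial \beta_{r}} \sum_{i=1}^{n} Q_{i} \\
          &= \sum_{i=1}^{n} \frac{\partial Q_{i}}{\partial \mu_{i}}\,\frac{\partial \mu_{i}}{\partial \eta_{i}}\,\frac{\partial \eta_{i}}{\partial \beta_{r}} \\
          &= \sum_{i=1}^{n} q_{i}\left(\frac{\partial \mu_{i}}{\partial \eta_{i}}\right) x_{ir} \\
          &= \sum_{i=1}^{n} \frac{y_{i} - \mu_{i}}{V(\mu_{i})\,\phi}\left(\frac{\partial \mu_{i}}{\partial \eta_{i}}\right) x_{ir}
\end{aligned}
\]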
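The quasi-score components above can be evaluated directly once a link and a variance function are chosen. The sketch below assumes a log link and $V(\mu) = \mu$, and uses R's built-in `warpbreaks` data purely for illustration (the data set and model formula are not from the slides). At the fitted values returned by `glm()` every component should be numerically zero.

```r
# Quasi-score u*_r = sum_i (y_i - mu_i) / (V(mu_i) * phi) * (dmu_i/deta_i) * x_ir,
# evaluated at the quasi-likelihood estimate for a log link with V(mu) = mu.
fit <- glm(breaks ~ wool + tension, data = warpbreaks, family = quasipoisson())

X   <- model.matrix(fit)          # design matrix, entries x_ir
y   <- warpbreaks$breaks
mu  <- fitted(fit)                # fitted means mu_i
phi <- summary(fit)$dispersion    # dispersion estimate reported by glm

dmu_deta <- mu                    # log link: dmu/deta = mu
V        <- mu                    # assumed variance function V(mu) = mu

# Each row of X is scaled by its observation's weight, then columns are summed.
quasi_score <- colSums((y - mu) / (V * phi) * dmu_deta * X)
print(round(quasi_score, 6))
```

With this link and variance function the weight reduces to $(y_i - \mu_i)/\phi$, so the score equations are $X^{\top}(y - \mu) = 0$.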
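- $V(\mu) = \mu$:
  \[
  \begin{aligned}
  Q &= \int_{y}^{\mu} \frac{y - u}{\phi\,u}\,du \\
    &= \frac{1}{\phi}\Big[\, y\log u - u \,\Big]_{y}^{\mu} \\
    &= \frac{1}{\phi}\big[\, y\log\mu - \mu - y\log y + y \,\big]
  \end{aligned}
  \]
  \[
  u^{*}_{r} = \frac{1}{\phi}\sum_{i=1}^{n} \frac{y_{i} - \mu_{i}}{\mu_{i}}\left(\frac{\partial \mu_{i}}{\partial \eta_{i}}\right) x_{ir}
  \]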
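- $V(\mu) = 1$:
  \[
  \begin{aligned}
  Q &= \int_{y}^{\mu} \frac{y - u}{\phi}\,du \\
    &= \frac{1}{\phi}\left[\, uy - \frac{u^{2}}{2} \,\right]_{y}^{\mu} \\
    &= \frac{1}{\phi}\left[\, \mu y - \frac{\mu^{2}}{2} - y^{2} + \frac{y^{2}}{2} \,\right] \\
    &= \frac{1}{2\phi}\left[\, -y^{2} + 2\mu y - \mu^{2} \,\right] \\
    &= -\frac{(y - \mu)^{2}}{2\phi}
  \end{aligned}
  \]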
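- $V(\mu) = \mu(1 - \mu)$:
  \[
  \begin{aligned}
  Q &= \int_{y}^{\mu} \frac{y - u}{\phi\,u(1 - u)}\,du \\
    &= \frac{1}{\phi}\left\{ \int_{y}^{\mu} \frac{y}{u(1 - u)}\,du - \int_{y}^{\mu} \frac{du}{1 - u} \right\} \\
    &= \frac{1}{\phi}\left\{ y\int_{y}^{\mu} \frac{1 - u + u}{u(1 - u)}\,du - \int_{y}^{\mu} \frac{du}{1 - u} \right\} \\
    &= \frac{1}{\phi}\left\{ y\int_{y}^{\mu} \frac{du}{u} + y\int_{y}^{\mu} \frac{du}{1 - u} - \int_{y}^{\mu} \frac{du}{1 - u} \right\} \\
    &= \frac{1}{\phi}\Big\{ \big[\, y\log u \,\big]_{y}^{\mu} - \big[\, y\log(1 - u) \,\big]_{y}^{\mu} + \big[\, \log(1 - u) \,\big]_{y}^{\mu} \Big\} \\
    &= \frac{1}{\phi}\big\{ y\log\mu - y\log y - y\log(1 - \mu) + y\log(1 - y) + \log(1 - \mu) - \log(1 - y) \big\} \\
    &= \frac{1}{\phi}\left\{ y\log\frac{\mu}{1 - \mu} + \log(1 - \mu) - y\log\frac{y}{1 - y} - \log(1 - y) \right\}
  \end{aligned}
  \]
  The part that depends on $\mu$, $y\log\dfrac{\mu}{1 - \mu} + \log(1 - \mu)$, is the log-likelihood for the binomial distribution.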
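The closed forms for $V(\mu) = \mu$, $V(\mu) = 1$ and $V(\mu) = \mu(1-\mu)$ can be compared with direct numerical integration of the defining integral. In the sketch below, `quasi_loglik()` is a hypothetical helper name, and the values of `y`, `mu` and `phi` are arbitrary illustrative choices.

```r
# Numerical quasi-likelihood Q = integral from y to mu of (y - u) / (phi * V(u)) du.
# quasi_loglik() is a hypothetical helper written for this illustration.
quasi_loglik <- function(y, mu, V, phi = 1) {
  integrand <- function(u) (y - u) / (phi * V(u))
  integrate(integrand, lower = y, upper = mu)$value
}

phi <- 2             # illustrative dispersion
y   <- 3;   mu  <- 4.5   # illustrative values for V(mu) = mu and V(mu) = 1
yb  <- 0.3; mub <- 0.6   # illustrative proportions for V(mu) = mu * (1 - mu)

# V(mu) = mu
num_pois <- quasi_loglik(y, mu, function(u) u, phi)
cf_pois  <- (y * log(mu) - mu - y * log(y) + y) / phi

# V(mu) = 1
num_norm <- quasi_loglik(y, mu, function(u) rep(1, length(u)), phi)
cf_norm  <- -(y - mu)^2 / (2 * phi)

# V(mu) = mu * (1 - mu)
num_bin <- quasi_loglik(yb, mub, function(u) u * (1 - u), phi)
cf_bin  <- (yb * log(mub / (1 - mub)) + log(1 - mub) -
            yb * log(yb / (1 - yb)) - log(1 - yb)) / phi

print(rbind(numerical   = c(num_pois, num_norm, num_bin),
            closed_form = c(cf_pois, cf_norm, cf_bin)))
```

Only the variance function changes between the three cases, which is the point of the construction: specifying $V(\mu)$ is enough to obtain $Q$.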
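\[
Q = \sum_{i=1}^{n} Q_{i},
\qquad\text{where}\qquad
Q_{i} = \int_{y_{i}}^{\mu_{i}} \frac{y_{i} - u}{\phi\,V(u)}\,du
\]
Quasi-deviance:
\[
D = 2\sum_{i=1}^{n} \int_{\mu_{i}}^{y_{i}} \frac{y_{i} - u}{V(u)}\,du
\]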
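The quasi-deviance is the quasi-likelihood with the integration limits reversed and the factor $1/\phi$ removed, so the two are linked by a simple identity. The second line works this out for the illustrative choice $V(u) = u$:

```latex
\begin{aligned}
D &= 2\sum_{i=1}^{n}\int_{\mu_i}^{y_i}\frac{y_i - u}{V(u)}\,du
   = -2\phi\sum_{i=1}^{n}\int_{y_i}^{\mu_i}\frac{y_i - u}{\phi\,V(u)}\,du
   = -2\phi\,Q, \\
V(u) = u:\quad
D &= 2\sum_{i=1}^{n}\Big[\, y_i\log u - u \,\Big]_{\mu_i}^{y_i}
   = 2\sum_{i=1}^{n}\left\{ y_i\log\frac{y_i}{\mu_i} - (y_i - \mu_i) \right\}.
\end{aligned}
```

For $V(u) = u$ this has the same form as the familiar Poisson deviance.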
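- Estimation: the quasi-score vector is
  \[
  \begin{aligned}
  \mathbf{u} = \frac{\partial Q}{\partial \boldsymbol{\beta}}
    &= \sum_{i=1}^{n} \frac{y_{i} - \mu_{i}}{\phi\,V(\mu_{i})}\,\frac{\partial \mu_{i}}{\partial \boldsymbol{\beta}} \\
    &= \frac{1}{\phi}\,D^{\top} V^{-1}(\mathbf{y} - \boldsymbol{\mu}),
  \end{aligned}
  \]
  where $D = \left(\left(\dfrac{\partial \mu_{i}}{\partial \beta_{r}}\right)\right)$ is the $n \times p$ matrix of derivatives (not the quasi-deviance), $V = \mathrm{diag}\big(V(\mu_{i})\big)$, and
  \[
  \mathbf{y} - \boldsymbol{\mu} =
  \begin{pmatrix} Y_{1} - \mu_{1} \\ Y_{2} - \mu_{2} \\ \vdots \\ Y_{n} - \mu_{n} \end{pmatrix}.
  \]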
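Quasi-information matrix:
\[
\begin{aligned}
A &= E\left[-\frac{\partial^{2} Q}{\partial \boldsymbol{\beta}\,\partial \boldsymbol{\beta}^{\top}}\right] \\
  &= -\frac{1}{\phi}\sum_{i=1}^{n} E\left[ (y_{i} - \mu_{i})\,\frac{\partial}{\partial \boldsymbol{\beta}^{\top}}\left\{\frac{1}{V(\mu_{i})}\frac{\partial \mu_{i}}{\partial \boldsymbol{\beta}}\right\}
     + \frac{1}{V(\mu_{i})}\frac{\partial \mu_{i}}{\partial \boldsymbol{\beta}}\,\frac{\partial}{\partial \boldsymbol{\beta}^{\top}}(y_{i} - \mu_{i}) \right] \\
  &= \frac{1}{\phi}\sum_{i=1}^{n} \frac{\partial \mu_{i}}{\partial \boldsymbol{\beta}}\,\frac{1}{V(\mu_{i})}\,\frac{\partial \mu_{i}}{\partial \boldsymbol{\beta}^{\top}} \\
  &= \frac{1}{\phi}\,D^{\top} V^{-1} D
\end{aligned}
\]
The first term vanishes because $E(y_{i} - \mu_{i}) = 0$, and in the second term $\dfrac{\partial}{\partial \boldsymbol{\beta}^{\top}}(y_{i} - \mu_{i}) = -\dfrac{\partial \mu_{i}}{\partial \boldsymbol{\beta}^{\top}}$.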
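At the $m$th iteration:
\[
\begin{aligned}
\hat{\boldsymbol{\beta}}^{(m)}
  &= \hat{\boldsymbol{\beta}}^{(m-1)} + \big[ A^{-1}\mathbf{u} \big]_{\boldsymbol{\beta} = \hat{\boldsymbol{\beta}}^{(m-1)}} \\
  &= \hat{\boldsymbol{\beta}}^{(m-1)} + \big[ (D^{\top} V^{-1} D)^{-1} D^{\top} V^{-1} (\mathbf{y} - \boldsymbol{\mu}) \big]_{\boldsymbol{\beta} = \hat{\boldsymbol{\beta}}^{(m-1)}}
\end{aligned}
\]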
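The iteration can be coded in a few lines. The sketch below is a minimal illustration, assuming a log link and $V(\mu) = \mu$; the `warpbreaks` data, the model formula, the starting value and the stopping tolerance are illustrative choices, not part of the slides. Note that $\phi$ cancels in $A^{-1}\mathbf{u}$, so it is not needed to compute the estimate.

```r
# Fisher scoring for the quasi-likelihood estimate of beta,
# assuming a log link and V(mu) = mu (illustrative choices).
X <- model.matrix(breaks ~ wool + tension, data = warpbreaks)
y <- warpbreaks$breaks

beta <- c(log(mean(y)), rep(0, ncol(X) - 1))   # starting value (assumption)

for (m in 1:25) {
  mu   <- as.vector(exp(X %*% beta))   # inverse log link
  D    <- mu * X                       # dmu_i/dbeta_r = mu_i * x_ir
  Vinv <- 1 / mu                       # diagonal of V^{-1}
  # step = (D' V^{-1} D)^{-1} D' V^{-1} (y - mu)
  step <- solve(t(D) %*% (Vinv * D), t(D) %*% (Vinv * (y - mu)))
  beta <- beta + as.vector(step)
  if (max(abs(step)) < 1e-10) break    # stop once the update is negligible
}

# Compare with R's built-in quasi-Poisson fit
fit <- glm(breaks ~ wool + tension, data = warpbreaks, family = quasipoisson())
print(cbind(fisher_scoring = beta, glm = coef(fit)))
```

The two columns should agree to several decimal places, since `glm()` solves the same estimating equations by iteratively reweighted least squares.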
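- $E(\hat{\boldsymbol{\beta}}) = \boldsymbol{\beta} + O(n^{-1})$ and $D(\hat{\boldsymbol{\beta}}) = \phi\,(D^{\top} V^{-1} D)^{-1} + O(n^{-1})$, where $D(\hat{\boldsymbol{\beta}})$ on the left-hand side denotes the dispersion (covariance) matrix of $\hat{\boldsymbol{\beta}}$.
- The covariance $\phi\,(D^{\top} V^{-1} D)^{-1}$ involves the overdispersion parameter $\phi$.
- Estimate of $\phi$: no estimate of $\phi$ can be obtained directly from the quasi-likelihood. Conventionally $\phi$ is estimated by the method of moments:
  \[
  \hat{\phi} = \frac{1}{n - p}\sum_{i=1}^{n} \frac{(y_{i} - \hat{\mu}_{i})^{2}}{V(\hat{\mu}_{i})},
  \]
  where $p$ is the number of components of $\boldsymbol{\beta}$.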
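The moment estimate of $\phi$ is straightforward to compute from fitted values. The sketch below assumes $V(\mu) = \mu$ and again uses the built-in `warpbreaks` data for illustration; it compares the hand-computed value with the dispersion that `summary()` reports for a quasi-Poisson fit.

```r
# Moment (Pearson) estimate of the dispersion phi, assuming V(mu) = mu.
fit <- glm(breaks ~ wool + tension, data = warpbreaks, family = quasipoisson())

y      <- warpbreaks$breaks
mu_hat <- fitted(fit)
n      <- length(y)
p      <- length(coef(fit))

phi_hat <- sum((y - mu_hat)^2 / mu_hat) / (n - p)

print(c(moment_estimate = phi_hat, glm_dispersion = summary(fit)$dispersion))
```

A value well above 1 suggests more variability than a Poisson model would allow.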
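- Very often the underlying distribution of the population is not known (e.g., in the case of overdispersed data). In such cases the likelihood cannot be obtained, and hence inferences cannot be based on it.
- To overcome this, a quasi-likelihood function is used, which mimics the likelihood function and satisfies (I) and (II).
- $E\left[\dfrac{\partial L}{\partial \mu}\right] = 0$ ... (I)
- $\mathrm{Var}\left[\dfrac{\partial L}{\partial \mu}\right] = \dfrac{1}{b''(\theta)\,\phi}$ ... (II)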
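- Quasi-deviance function: $D = 2\displaystyle\sum_{i=1}^{n} \int_{\mu_{i}}^{y_{i}} \frac{y_{i} - u}{V(u)}\,du$
- Estimation, quasi-score vector: $\mathbf{u} = \dfrac{1}{\phi}\,D^{\top} V^{-1}(\mathbf{y} - \boldsymbol{\mu})$, with $D$ here the matrix of derivatives $\partial \mu_{i} / \partial \beta_{r}$
- Quasi-information matrix: $A = \dfrac{1}{\phi}\,D^{\top} V^{-1} D$
- Estimator: $\hat{\boldsymbol{\beta}}^{(m)} = \hat{\boldsymbol{\beta}}^{(m-1)} + \big[ (D^{\top} V^{-1} D)^{-1} D^{\top} V^{-1} (\mathbf{y} - \boldsymbol{\mu}) \big]_{\boldsymbol{\beta} = \hat{\boldsymbol{\beta}}^{(m-1)}}$
- Estimate of $\phi$: $\hat{\phi} = \dfrac{1}{n - p}\displaystyle\sum_{i=1}^{n} \frac{(y_{i} - \hat{\mu}_{i})^{2}}{V(\hat{\mu}_{i})}$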
# install.packages(c("qcc", "robust"))
library(qcc)  # provides qcc.overdispersion.test()

# Data from Wetherill and Brown (1991), pp. 212-213, 216-218
# Binomial case: counts x out of samples of size 50
x <- c(12, 11, 18, 11, 10, 16, 9, 11, 14, 15, 11, 9, 10, 13, 12,
       8, 12, 13, 10, 12, 13, 16, 12, 18, 16, 10, 16, 10, 12, 14)
size <- rep(50, length(x))
qcc.overdispersion.test(x, size)

# Poisson case: no sample size supplied
x <- c(11, 8, 13, 11, 13, 17, 25, 23, 11, 16, 9, 15, 10, 16, 12,
       8, 9, 15, 4, 12, 12, 12, 15, 17, 14, 17, 12, 12, 7, 16)
qcc.overdispersion.test(x)

# Seizure data from the robust package
data(breslow.dat, package = "robust")
names(breslow.dat)
head(breslow.dat)
summary(breslow.dat[c(6, 7, 8, 10)])

# plot distribution of post-treatment seizure counts
opar <- par(no.readonly = TRUE)
par(mfrow = c(1, 2))
hist(breslow.dat$sumY, breaks = 20, xlab = "Seizure Count", main = "Distribution of Seizures")
boxplot(sumY ~ Trt, data = breslow.dat, xlab = "Treatment", main = "Group Comparisons")
par(opar)

# fit Poisson regression (the start of this call is missing in the source;
# the formula is assumed to match the quasipoisson fit below)
fit <- glm(sumY ~ Base + Age + Trt, data = breslow.dat, family = poisson())
summary(fit)

# interpreting model parameters
coef(fit)
exp(coef(fit))

# evaluating overdispersion
qcc.overdispersion.test(breslow.dat$sumY, type = "poisson")

# fit model with quasipoisson
fit.od <- glm(sumY ~ Base + Age + Trt, data = breslow.dat, family = quasipoisson())
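To see what the quasi-Poisson fit changes relative to the Poisson fit, the sketch below compares the two on R's built-in `warpbreaks` data. That data set and its model formula are substituted here only so the example runs without extra packages; the same comparison can be made between `fit` and `fit.od`. The coefficients are identical, and the standard errors are inflated by the square root of the estimated dispersion.

```r
# Poisson vs quasi-Poisson on a built-in data set (illustrative substitute
# for breslow.dat so that no additional package is required).
fit_pois  <- glm(breaks ~ wool + tension, data = warpbreaks, family = poisson())
fit_quasi <- glm(breaks ~ wool + tension, data = warpbreaks, family = quasipoisson())

se_pois  <- coef(summary(fit_pois))[, "Std. Error"]
se_quasi <- coef(summary(fit_quasi))[, "Std. Error"]
phi_hat  <- summary(fit_quasi)$dispersion

# Same point estimates; standard errors scaled by sqrt(phi_hat)
print(cbind(estimate = coef(fit_quasi),
            se_pois  = se_pois,
            se_quasi = se_quasi,
            ratio    = se_quasi / se_pois))
print(sqrt(phi_hat))
```

The `ratio` column should be constant and equal to the printed value of `sqrt(phi_hat)`.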