Subject: Statistics
Paper: Biostatistics
Module 39: Matched Analysis-I
Development Team
Principal investigator: Dr. Bhaswati Ganguli, Department of Statistics, University of Calcutta
Paper co-ordinator: Dr. Sugata SenRoy, Department of Statistics, University of Calcutta
Content writer: Dr. Atanu Bhattacharjee, Division of Clinical Research and Biostatistics, Malabar Cancer Centre
Content reviewer: Dr. Indranil Mukhopadhyay, Indian Statistical Institute, Kolkata
Introduction
* Case-control studies are an appropriate and effective means of studying rare diseases.
* Matching of cases and controls is frequently adopted to control the effects of known potential confounding variables.
* The analysis of matched data needs specific statistical methods.
Stratification and Matching
Stratified analysis yields an adjusted estimate and a test of the risk difference between the treatment or exposure groups by adjusting for the effect of an intervening confounding factor.
* Stratified analysis is similar to using a regression model as the basis for the adjustment.
Matched analysis is an alternative approach to controlling for the effects of a covariate.
* Each member of one group is matched to a member of the other group with respect to the values of one or more covariates.
Frequency Matching
Frequency Matching: The members of the comparison group are sampled within separate categories of a discrete covariate, such as sex (male/female) or decade of age (0-9 years, 10-19 years, etc.), so that the members of each group are matched within each category.
Example: The cases may be stratified by sex and decade of age. Then, within each category, such as females aged 40-49 years, a separate sample of controls is selected from the control population in that category (i.e. females aged 40-49 years), and thereafter the frequencies of females in the case and control groups are compared.
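The sampling scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `frequency_match`, the `(subject_id, stratum)` data layout, the `ratio` parameter and the example subjects are all assumptions made for the illustration and do not come from the module.

```python
import random
from collections import Counter, defaultdict

def frequency_match(cases, control_pool, ratio=1, seed=0):
    """Sample `ratio` controls per case within each covariate stratum.

    cases, control_pool: lists of (subject_id, stratum) tuples
    (a hypothetical layout chosen for this sketch).
    Returns the list of selected controls.
    """
    rng = random.Random(seed)

    # Number of cases observed in each stratum.
    cases_per_stratum = Counter(stratum for _, stratum in cases)

    # Group the available controls by stratum.
    pool_by_stratum = defaultdict(list)
    for subject in control_pool:
        pool_by_stratum[subject[1]].append(subject)

    selected = []
    for stratum, n_cases in cases_per_stratum.items():
        candidates = pool_by_stratum[stratum]
        # The sample cannot be larger than the pool in that stratum.
        n_needed = min(ratio * n_cases, len(candidates))
        selected.extend(rng.sample(candidates, n_needed))
    return selected

# Hypothetical data: each stratum label combines sex and decade of age.
cases = [(1, "F40-49"), (2, "F40-49"), (3, "M50-59")]
control_pool = ([(100 + i, "F40-49") for i in range(10)]
                + [(200 + i, "M50-59") for i in range(10)])

controls = frequency_match(cases, control_pool, ratio=2)
control_counts = Counter(stratum for _, stratum in controls)
print(control_counts)
```

With two controls per case, the control sample reproduces the stratum distribution of the cases, which is the defining feature of frequency matching.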
Frequency Matched Study
* Separate samples of cases and controls are observed in covariate categories.
* The goal is to obtain adequate numbers of subjects from each group in each stratum to provide a sufficient overall comparison between groups.
* A sufficient number of exposed and non-exposed cases and controls must be sampled within each stratum to calculate and compare disease frequency.
Odds Ratio Estimation
Table 1: 2×2 contingency table from a matched analysis
                    Control Exposed   Control Not-exposed   Total
Cases Exposed       a                 b                     a+b
Cases Not-exposed   c                 d                     c+d
Total               a+c               b+d                   T
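The odds ratio (OR) is invariant to the study design.
Let $E = 1$ for exposed and $0$ for not exposed, and $D = 1$ for diseased and $0$ for not diseased.
\[ \operatorname{logit} P[D = 1 \mid E] = \alpha_0 + \beta E, \qquad \operatorname{logit} P[E = 1 \mid D] = \alpha_0^* + \beta D \]
To eliminate the nuisance parameter $\alpha_0^*$, we usually condition on the marginal table, i.e. the total number of exposed subjects in a pair.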
Consider each pair as a 2×2 table.
Pair 1
                    Control Exposed   Control Not-exposed   Total
Cases Exposed       1                 0                     1
Cases Not-exposed   0                 0                     0
Total               1                 0                     1
Pair 2
                    Control Exposed   Control Not-exposed   Total
Cases Exposed       0                 0                     0
Cases Not-exposed   0                 1                     1
Total               0                 1                     1
Pair 3
                    Control Exposed   Control Not-exposed   Total
Cases Exposed       0                 1                     1
Cases Not-exposed   0                 0                     0
Total               0                 1                     1
Pair 4
                    Control Exposed   Control Not-exposed   Total
Cases Exposed       0                 0                     0
Cases Not-exposed   1                 0                     1
Total               1                 0                     1
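As a small illustration, the four pair types above can be tallied into the cells a, b, c and d of the matched 2×2 table. The function name `tally_pairs` and the encoding of each pair as `(case_exposed, control_exposed)` with 0/1 values are assumptions made for this sketch.

```python
def tally_pairs(pairs):
    """Count matched pairs into the cells of the matched 2x2 table.

    pairs: iterable of (case_exposed, control_exposed), each 0 or 1
    (a hypothetical encoding chosen for this sketch).
    a: both exposed, b: only the case exposed,
    c: only the control exposed, d: neither exposed.
    """
    counts = {"a": 0, "b": 0, "c": 0, "d": 0}
    for case_exposed, control_exposed in pairs:
        if case_exposed and control_exposed:
            counts["a"] += 1
        elif case_exposed:
            counts["b"] += 1
        elif control_exposed:
            counts["c"] += 1
        else:
            counts["d"] += 1
    return counts

# Pairs 1 to 4 from the tables above, in order.
four_pairs = [(1, 1), (0, 0), (1, 0), (0, 1)]
table = tally_pairs(four_pairs)
print(table)
```

Pairs 1 and 2 are concordant (cells a and d), while pairs 3 and 4 are discordant (cells b and c); only the discordant pairs have exactly one exposed member.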
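Now, for a pair with exactly one exposed member, the conditional probability that the exposed member is the case is
\[
\frac{\dfrac{e^{\alpha_0^*+\beta}}{1+e^{\alpha_0^*+\beta}} \cdot \dfrac{1}{1+e^{\alpha_0^*}}}
{\dfrac{e^{\alpha_0^*+\beta}}{1+e^{\alpha_0^*+\beta}} \cdot \dfrac{1}{1+e^{\alpha_0^*}} + \dfrac{1}{1+e^{\alpha_0^*+\beta}} \cdot \dfrac{e^{\alpha_0^*}}{1+e^{\alpha_0^*}}}
= \frac{e^{\beta}}{1+e^{\beta}} = \frac{\psi}{1+\psi}, \qquad \psi = e^{\beta}, \tag{1}
\]
and the conditional probability that the exposed member is the control is $1/(1+\psi)$.
Thus, with $b$ pairs in which only the case is exposed and $c$ pairs in which only the control is exposed,
\[ L = \left(\frac{\psi}{\psi+1}\right)^{b} \left(\frac{1}{1+\psi}\right)^{c} \tag{2} \]
\[ \log L = b[\log\psi - \log(\psi+1)] - c\log(1+\psi) \tag{3} \]
\[ \log L = b\log\psi - (b+c)\log(\psi+1) \tag{4} \]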
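Thus,
\[ \frac{\partial \log L}{\partial \psi} = \frac{b}{\psi} - \frac{b+c}{1+\psi} = 0 \tag{5} \]
\[ b + b\psi - b\psi - c\psi = 0 \tag{6} \]
\[ \hat{\psi} = \frac{b}{c} = \text{MLE} \tag{7} \]
The standard error of $\hat{\psi}$ may be derived from the inverse of the information matrix.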
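A minimal Python sketch of this estimate follows. Differentiating the log-likelihood twice and evaluating at $\hat{\psi} = b/c$ gives an observed information of $c^3/[b(b+c)]$, so the sketch uses $\sqrt{b(b+c)/c^3}$ as the standard error of $\hat{\psi}$. The function name, the Wald interval on the log scale with standard error $\sqrt{1/b + 1/c}$, and the counts b = 30, c = 10 are assumptions added for illustration, not part of the module.

```python
import math

def matched_pair_odds_ratio(b, c, z=1.96):
    """Conditional MLE of the odds ratio from discordant pair counts.

    b: pairs where only the case is exposed
    c: pairs where only the control is exposed
    Returns (psi_hat, se_psi, (lower, upper)).
    """
    if b <= 0 or c <= 0:
        raise ValueError("both discordant counts must be positive")

    psi_hat = b / c

    # Inverse observed information evaluated at psi_hat = b / c.
    se_psi = math.sqrt(b * (b + c) / c**3)

    # Delta method on the log scale: se_psi / psi_hat = sqrt(1/b + 1/c).
    se_log_psi = math.sqrt(1 / b + 1 / c)

    # Wald interval built on log(psi) and transformed back (an assumption
    # of this sketch; other interval methods exist).
    lower = math.exp(math.log(psi_hat) - z * se_log_psi)
    upper = math.exp(math.log(psi_hat) + z * se_log_psi)
    return psi_hat, se_psi, (lower, upper)

# Hypothetical discordant counts, for illustration only.
psi_hat, se_psi, ci = matched_pair_odds_ratio(b=30, c=10)
print(psi_hat, se_psi, ci)
```

Note that the concordant cells a and d do not enter the estimate at all; only the discordant pairs carry information about $\psi$.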
Frequency Matching Example
Table 2: Smoker and Non-Smoker Data
Age     Case Smoker   Case Non-smoker   Control Smoker   Control Non-smoker
20-29   16            2                 12               5
30-39   18            4                 22               15
40-49   20            6                 18               15
* The sampling procedure was to first select all 66 cases and stratify them by age interval.
* Within each age stratum, controls were then selected, and exposure status was determined for all cases and controls.
* This yields 3 independent strata, with an independent 2×2 table within each stratum.
* Apply stratified analysis (such as a Mantel-Haenszel
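As one possible illustration of a stratified analysis of this table, the sketch below computes the standard Mantel-Haenszel pooled odds ratio, $\sum_i (a_i d_i / n_i) \big/ \sum_i (b_i c_i / n_i)$, across the three age strata. The function name is an assumption, and here (a, b, c, d) denote the within-stratum cells (exposed cases, unexposed cases, exposed controls, unexposed controls), not the cells of the matched-pair table.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across independent 2x2 strata.

    strata: list of (a, b, c, d) with
      a = exposed cases,    b = unexposed cases,
      c = exposed controls, d = unexposed controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d          # stratum total
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Smoker / non-smoker counts by age stratum, taken from the table above.
smoking_strata = [
    (16, 2, 12, 5),    # 20-29
    (18, 4, 22, 15),   # 30-39
    (20, 6, 18, 15),   # 40-49
]

stratum_ors = [(a * d) / (b * c) for a, b, c, d in smoking_strata]
or_mh = mantel_haenszel_or(smoking_strata)
print(stratum_ors, or_mh)
```

The pooled estimate is a weighted combination of the stratum-specific odds ratios, so it lies between the smallest and largest of them.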
McNemar’s Large Sample Test
* Large-sample tests can be derived from the multinomial/binomial distribution through the normal approximation.
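Let the cell proportions be $p_{ij} = n_{ij}/N$ for the $ij$th cell, with $E(p_{ij}) = \pi_{ij}$, $i = 1,2$; $j = 1,2$. Then, for large $N$,
\[ (p_{11}, p_{12}, p_{21}, p_{22})^{T} \sim N(\pi, \Sigma), \qquad \pi = (\pi_{11}, \pi_{12}, \pi_{21}, \pi_{22})^{T}, \]
\[
\Sigma = \frac{1}{N}
\begin{pmatrix}
\pi_{11}(1-\pi_{11}) & -\pi_{11}\pi_{12} & -\pi_{11}\pi_{21} & -\pi_{11}\pi_{22} \\
-\pi_{11}\pi_{12} & \pi_{12}(1-\pi_{12}) & -\pi_{12}\pi_{21} & -\pi_{12}\pi_{22} \\
-\pi_{11}\pi_{21} & -\pi_{12}\pi_{21} & \pi_{21}(1-\pi_{21}) & -\pi_{21}\pi_{22} \\
-\pi_{11}\pi_{22} & -\pi_{12}\pi_{22} & -\pi_{21}\pi_{22} & \pi_{22}(1-\pi_{22})
\end{pmatrix}.
\]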
To test $H_0: \pi_{12} = \pi_{21}$, we wish to apply a test statistic based on the difference in the discordant proportions $p_{12} - p_{21}$, of the form
\[ z = \frac{p_{12} - p_{21}}{\sqrt{\hat{V}(p_{12} - p_{21} \mid H_0)}} \tag{8} \]
with the variance of the difference evaluated under the null hypothesis. In general,
\[ V(p_{12} - p_{21}) = \frac{(\pi_{12} + \pi_{21}) - (\pi_{12} - \pi_{21})^{2}}{N}. \]
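For readers who want the intermediate step, the variance of the difference in discordant proportions can be worked out from the multinomial variances and covariance of $p_{12}$ and $p_{21}$. The following is a sketch of that algebra, using only the multinomial moments stated earlier:

```latex
\begin{align*}
V(p_{12} - p_{21})
  &= V(p_{12}) + V(p_{21}) - 2\,\mathrm{Cov}(p_{12}, p_{21}) \\
  &= \frac{\pi_{12}(1-\pi_{12}) + \pi_{21}(1-\pi_{21}) + 2\pi_{12}\pi_{21}}{N} \\
  &= \frac{(\pi_{12} + \pi_{21}) - (\pi_{12}^{2} - 2\pi_{12}\pi_{21} + \pi_{21}^{2})}{N} \\
  &= \frac{(\pi_{12} + \pi_{21}) - (\pi_{12} - \pi_{21})^{2}}{N}.
\end{align*}
```

When $\pi_{12} = \pi_{21}$, the squared term vanishes and the variance reduces to $(\pi_{12} + \pi_{21})/N$.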
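If $\pi_{12} = \pi_{21}$ under $H_0$, then $\pi_{12} = \pi_{21} = \pi$ with $\pi = \pi_d/2$, where $\pi_d = \pi_{12} + \pi_{21}$ is the probability of a discordant pair. Then
\[ V(p_{12} - p_{21} \mid H_0) = \frac{\pi_{12} + \pi_{21}}{N} = \frac{\pi_d}{N} \simeq \frac{p_{12} + p_{21}}{N} \]
and
\[ Z_{\text{Modified}} = \frac{p_{12} - p_{21}}{\sqrt{(p_{12} + p_{21})/N}} = \frac{f - g}{\sqrt{f + g}}, \]
where $f = n_{12}$ and $g = n_{21}$ are the discordant cell counts, is asymptotically standard normal under $H_0$.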
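The large-sample statistic can be computed directly from the two discordant counts. The sketch below is illustrative only: the function name, the two-sided p-value from the standard normal distribution, and the counts f = 25, g = 10 are assumptions, not values from the module.

```python
import math

def mcnemar_z(f, g):
    """McNemar large-sample statistic from discordant counts f and g.

    Returns (z, two-sided p-value), with z = (f - g) / sqrt(f + g).
    """
    if f + g == 0:
        raise ValueError("at least one discordant pair is required")
    z = (f - g) / math.sqrt(f + g)
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical discordant counts, for illustration only.
z_stat, p_value = mcnemar_z(f=25, g=10)
print(z_stat, p_value)
```

Squaring the statistic gives $(f-g)^2/(f+g)$, which can be referred to a chi-square distribution with one degree of freedom instead.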