# Matching of cases and controls is frequently adopted to control the effects of known potential confounding variables

## Full text


### Module 39: Matched Analysis-I


Development Team

Principal investigator: Dr. Bhaswati Ganguli, Department of Statistics, University of Calcutta

Paper co-ordinator: Dr. Sugata SenRoy, Department of Statistics, University of Calcutta

Content writer: Dr. Atanu Bhattacharjee, Division of Clinical Research and Biostatistics, Malabar Cancer Centre

Content reviewer: Dr. Indranil Mukhopadhyay, Indian Statistical Institute, Kolkata


Introduction

* Case-control studies are an appropriate and effective means of studying rare diseases.

* Matching of cases and controls is frequently adopted to control the effects of known potential confounding variables.

* The analysis of matched data needs specific statistical methods.

Stratification and Matching

Stratified analysis provides an adjusted estimate and a test of the risk difference between the treatment or exposure groups by adjusting for the effect of an intervening confounding factor.

* Stratified analysis is similar to using a regression model as the basis for the adjustment.

Matched analysis is an alternative approach to controlling for the effects of a covariate.

* Each member of a group is matched to a member of the other group with respect to the values of one or more covariates.

Frequency Matching

Frequency matching: The members of the comparison group are sampled within separate categories of a discrete covariate, such as sex (male/female) or decade of age (0-9 years, 10-19 years, etc.), and thereafter the members of each group are matched within each category.

Example: The cases may be stratified by sex and decade of age. Then, within each category, such as females aged 40-49 years, a separate sample of controls is selected from the control population in that category (i.e. females aged 40-49 years), and thereafter the frequencies of females in the case and control groups are compared.
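The sampling step in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `frequency_match`, the `ratio` and `seed` parameters, and the toy case and control identifiers are all hypothetical and do not come from the module.

```python
import random
from collections import Counter

def frequency_match(case_strata, control_pool, ratio=1, seed=0):
    """Sample `ratio` controls per case within each covariate stratum.

    case_strata  : list of stratum labels, one label per case
    control_pool : dict mapping stratum label -> list of candidate control ids
    """
    rng = random.Random(seed)          # fixed seed so the sketch is reproducible
    needed = Counter(case_strata)      # number of cases in each stratum
    sampled = {}
    for stratum, n_cases in needed.items():
        candidates = control_pool.get(stratum, [])
        # Never ask for more controls than the stratum actually contains.
        k = min(ratio * n_cases, len(candidates))
        sampled[stratum] = rng.sample(candidates, k)
    return sampled

# Hypothetical data: 3 female cases aged 40-49 and 2 male cases aged 50-59.
cases = [("F", "40-49")] * 3 + [("M", "50-59")] * 2
pool = {
    ("F", "40-49"): [f"c{i}" for i in range(10)],
    ("M", "50-59"): [f"c{i}" for i in range(10, 16)],
}
controls = frequency_match(cases, pool)
print({stratum: len(ids) for stratum, ids in controls.items()})
```

With `ratio=1` the control sample reproduces the stratum distribution of the cases, which is the defining property of frequency matching.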

Frequency Matched Study

* Separate samples of cases and controls are observed within covariate categories.

* The goal is to get adequate numbers of subjects from each group in each stratum to provide a sufficient overall comparison between groups.

* A sufficient number of exposed and non-exposed cases and controls should be sampled within each stratum to calculate and compare disease frequency.

Odds Ratio Estimation

Table 1: 2×2 contingency table from a matched analysis

| | Control exposed | Control not exposed | Total |
|---|---|---|---|
| Case exposed | a | b | a+b |
| Case not exposed | c | d | c+d |
| Total | a+c | b+d | T |

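OR is invariant to study design.

Let $E = 1$ for exposed and $0$ for not exposed, and $D = 1$ for diseased and $0$ for not diseased. Then

$$\operatorname{logit} P[D = 1 \mid E] = \alpha_0 + \beta E, \qquad \operatorname{logit} P[E = 1 \mid D] = \alpha_0^* + \beta D.$$

To eliminate the nuisance intercept ($\alpha_0$, or $\alpha_0^*$ in the retrospective model), we usually condition on the marginal table, i.e. the total number of exposed subjects in a pair.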
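The invariance of the odds ratio can be checked numerically. The sketch below uses made-up counts (the variable names `n11`, `n10`, `n01`, `n00` and their values are assumptions for illustration) and computes the odds ratio from the prospective view, from the retrospective view, and as the cross-product ratio.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)

# Hypothetical counts: first index = exposure (1/0), second index = disease (1/0).
n11, n10 = 30, 70   # exposed:   diseased, not diseased
n01, n00 = 10, 90   # unexposed: diseased, not diseased

# Prospective view: odds of disease, compared between exposed and unexposed.
or_prospective = odds(n11 / (n11 + n10)) / odds(n01 / (n01 + n00))

# Retrospective view: odds of exposure, compared between diseased and non-diseased.
or_retrospective = odds(n11 / (n11 + n01)) / odds(n10 / (n10 + n00))

# Cross-product ratio of the 2x2 table.
cross_product = (n11 * n00) / (n10 * n01)

print(or_prospective, or_retrospective, cross_product)
```

All three quantities agree up to floating-point rounding, which is why the same $\beta$ appears in both logit models.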

Consider each pair as a 2×2 table.

Pair 1

| | Control exposed | Control not exposed | Total |
|---|---|---|---|
| Case exposed | 1 | 0 | 1 |
| Case not exposed | 0 | 0 | 0 |
| Total | 1 | 0 | 1 |

Pair 2

| | Control exposed | Control not exposed | Total |
|---|---|---|---|
| Case exposed | 0 | 0 | 0 |
| Case not exposed | 0 | 1 | 1 |
| Total | 0 | 1 | 1 |

Pair 3

| | Control exposed | Control not exposed | Total |
|---|---|---|---|
| Case exposed | 0 | 1 | 1 |
| Case not exposed | 0 | 0 | 0 |
| Total | 0 | 1 | 1 |

Pair 4

| | Control exposed | Control not exposed | Total |
|---|---|---|---|
| Case exposed | 0 | 0 | 0 |
| Case not exposed | 1 | 0 | 1 |
| Total | 1 | 0 | 1 |
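As a rough illustration of how the four pair types feed the matched 2×2 table, the following sketch tallies pairs into the cells a, b, c and d. The encoding of a pair as `(case_exposed, control_exposed)` and the helper name `tally_pairs` are assumptions made for this example.

```python
from collections import Counter

# Each pair is (case_exposed, control_exposed), coded 1 = exposed, 0 = not exposed.
# The four entries mirror Pair 1 to Pair 4 above, in that order.
pairs = [(1, 1), (0, 0), (1, 0), (0, 1)]

def tally_pairs(pairs):
    """Count matched pairs into the cells of the matched 2x2 table."""
    counts = Counter(pairs)
    return {
        "a": counts[(1, 1)],  # both exposed (concordant)
        "b": counts[(1, 0)],  # case exposed, control not exposed (discordant)
        "c": counts[(0, 1)],  # case not exposed, control exposed (discordant)
        "d": counts[(0, 0)],  # neither exposed (concordant)
    }

table = tally_pairs(pairs)
print(table)
```

Only the discordant cells b and c carry information about the odds ratio once we condition on the number of exposed subjects in each pair.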

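Now, with $\psi = e^{\beta}$ denoting the odds ratio, the conditional probability that the case is the exposed member of a discordant pair is

$$\frac{\dfrac{e^{\alpha_0^*+\beta}}{1+e^{\alpha_0^*+\beta}}\cdot\dfrac{1}{1+e^{\alpha_0^*}}}{\left[\dfrac{e^{\alpha_0^*+\beta}}{1+e^{\alpha_0^*+\beta}}\cdot\dfrac{1}{1+e^{\alpha_0^*}}\right]+\left[\dfrac{1}{1+e^{\alpha_0^*+\beta}}\cdot\dfrac{e^{\alpha_0^*}}{1+e^{\alpha_0^*}}\right]} = \frac{\psi}{1+\psi},$$

and the conditional probability that the control is the exposed member is the complement,

$$1-\frac{\psi}{1+\psi} = \frac{1}{1+\psi}. \tag{1}$$

Thus, with $b$ and $c$ the numbers of discordant pairs,

$$L = \left(\frac{\psi}{\psi+1}\right)^{b}\left(\frac{1}{1+\psi}\right)^{c} \tag{2}$$

$$\log L = b[\log\psi - \log(\psi+1)] - c\log(1+\psi) \tag{3}$$

$$\log L = b\log\psi - (b+c)\log(\psi+1) \tag{4}$$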
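A quick numerical check makes the point of the conditioning argument concrete: the nuisance intercept cancels, so the conditional probability in equation (1) depends only on $\psi = e^{\beta}$. The sketch below assumes the retrospective logit model for exposure; the function names and the value of `beta` are hypothetical.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def prob_control_exposed_given_discordant(alpha_star, beta):
    """P(control is the exposed member | exactly one member of the pair is exposed)."""
    p_case = expit(alpha_star + beta)        # P(E = 1 | D = 1)
    p_ctrl = expit(alpha_star)               # P(E = 1 | D = 0)
    case_only = p_case * (1.0 - p_ctrl)      # case exposed, control not exposed
    control_only = (1.0 - p_case) * p_ctrl   # case not exposed, control exposed
    return control_only / (case_only + control_only)

beta = math.log(2.5)      # hypothetical log odds ratio
psi = math.exp(beta)
results = [prob_control_exposed_given_discordant(a, beta) for a in (-3.0, -1.0, 0.0, 2.0)]
print(results, 1.0 / (1.0 + psi))
```

Whatever intercept is used, each value equals $1/(1+\psi)$ up to floating-point rounding.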

Thus,

$$\frac{\partial \log L}{\partial \psi} = \frac{b}{\psi} - \frac{b+c}{1+\psi} = 0 \tag{5}$$

$$b + b\psi - b\psi - c\psi = 0 \tag{6}$$

$$\hat{\psi} = \frac{b}{c} = \text{MLE} \tag{7}$$

The standard error of $\hat{\psi}$ may be derived from the inverse information matrix.
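One way to carry this out, offered as a sketch rather than as the module's own derivation: differentiating the conditional log-likelihood twice and evaluating at $\hat{\psi} = b/c$ gives an observed information of $c^3/\{b(b+c)\}$, and the delta method then gives $\sqrt{1/b + 1/c}$ as the standard error of $\log\hat{\psi}$. The function name `matched_pair_or` and the counts used below are hypothetical.

```python
import math

def matched_pair_or(b, c):
    """Conditional MLE of the odds ratio from the discordant pair counts b and c."""
    if b <= 0 or c <= 0:
        raise ValueError("both discordant counts must be positive")
    psi_hat = b / c
    # Observed information: minus the second derivative of log L at psi_hat.
    info = c ** 3 / (b * (b + c))
    se_psi = math.sqrt(1.0 / info)
    # Delta method: SE(log psi_hat) = SE(psi_hat) / psi_hat = sqrt(1/b + 1/c).
    se_log_psi = math.sqrt(1.0 / b + 1.0 / c)
    return psi_hat, se_psi, se_log_psi

# Hypothetical discordant counts.
psi_hat, se_psi, se_log_psi = matched_pair_or(b=25, c=10)
lower = math.exp(math.log(psi_hat) - 1.96 * se_log_psi)
upper = math.exp(math.log(psi_hat) + 1.96 * se_log_psi)
print(psi_hat, se_psi, se_log_psi, (lower, upper))
```

The interval is built on the log scale because the sampling distribution of $\log\hat{\psi}$ is usually closer to normal than that of $\hat{\psi}$ itself.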

Frequency Matching Example

Table 2: Smoker and non-smoker data

| Age | Case: smoker | Case: non-smoker | Control: smoker | Control: non-smoker |
|---|---|---|---|---|
| 20-29 | 16 | 2 | 12 | 5 |
| 30-39 | 18 | 4 | 22 | 15 |
| 40-49 | 20 | 6 | 18 | 15 |

* The sampling procedure was to first select all 66 cases and stratify them by age interval.

* Within each age stratum, controls were then selected, and the exposure status of all cases and controls was ascertained.

* This yields 3 independent strata, with an independent 2×2 table within each stratum.

* Apply a stratified analysis (such as a Mantel-Haenszel analysis).
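As an illustration of the stratified analysis suggested in the last bullet, the sketch below computes the stratum-specific odds ratios and the Mantel-Haenszel pooled odds ratio from the smoker/non-smoker data above. The function name `mantel_haenszel_or` is an assumption; within this block, `a`, `b`, `c`, `d` denote the cells of each unmatched stratum table, not the matched-pair cells used earlier.

```python
# Per-stratum counts: (case smokers, case non-smokers, control smokers, control non-smokers).
strata = {
    "20-29": (16, 2, 12, 5),
    "30-39": (18, 4, 22, 15),
    "40-49": (20, 6, 18, 15),
}

def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio: sum(a*d/n) / sum(b*c/n) over strata."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata.values():
        n = a + b + c + d          # stratum size
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Stratum-specific odds ratios, for comparison with the pooled value.
stratum_or = {age: (a * d) / (b * c) for age, (a, b, c, d) in strata.items()}
or_mh = mantel_haenszel_or(strata)

print(stratum_or)
print(round(or_mh, 2))
```

The pooled estimate is a weighted average of the stratum-specific odds ratios, so it lies between the smallest and largest of them.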

McNemar’s Large Sample Test

* Large-sample tests can be derived from the multinomial/binomial distribution through the normal approximation.

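Let the cell proportions be $p_{ij} = n_{ij}/N$ for the $ij$th cell, with $E(p_{ij}) = \pi_{ij}$, $i = 1,2$; $j = 1,2$. Then, for large $N$,

$$\mathbf{p} = (p_{11}, p_{12}, p_{21}, p_{22})^{\top} \sim N(\boldsymbol{\pi}, \Sigma),$$

$$\Sigma = \frac{1}{N}\begin{pmatrix}
\pi_{11}(1-\pi_{11}) & -\pi_{11}\pi_{12} & -\pi_{11}\pi_{21} & -\pi_{11}\pi_{22} \\
-\pi_{11}\pi_{12} & \pi_{12}(1-\pi_{12}) & -\pi_{12}\pi_{21} & -\pi_{12}\pi_{22} \\
-\pi_{11}\pi_{21} & -\pi_{12}\pi_{21} & \pi_{21}(1-\pi_{21}) & -\pi_{21}\pi_{22} \\
-\pi_{11}\pi_{22} & -\pi_{12}\pi_{22} & -\pi_{21}\pi_{22} & \pi_{22}(1-\pi_{22})
\end{pmatrix}.$$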

Under $H_0: \pi_{12} = \pi_{21}$, we wish to apply a test statistic based on the difference in the discordant proportions, $p_{12} - p_{21}$, of the form

$$z = \frac{p_{12}-p_{21}}{\sqrt{\hat{V}(p_{12}-p_{21}\mid H_0)}} \tag{8}$$

with the variance of the difference evaluated under the null hypothesis. In general,

$$V(p_{12}-p_{21}) = \frac{(\pi_{12}+\pi_{21}) - (\pi_{12}-\pi_{21})^{2}}{N}.$$
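The variance of the difference follows from the multinomial variances and covariance of the two discordant proportions. The sketch below, with hypothetical function names and probabilities, checks that building the variance from those pieces agrees with the closed form, and that the closed form reduces to $(\pi_{12}+\pi_{21})/N$ when $\pi_{12} = \pi_{21}$.

```python
def var_difference_from_covariance(pi12, pi21, n):
    """Var(p12 - p21) assembled from multinomial variances and covariance."""
    var12 = pi12 * (1.0 - pi12) / n
    var21 = pi21 * (1.0 - pi21) / n
    cov = -pi12 * pi21 / n              # covariance of two multinomial proportions
    return var12 + var21 - 2.0 * cov

def var_difference_closed_form(pi12, pi21, n):
    """Closed form: [(pi12 + pi21) - (pi12 - pi21)^2] / N."""
    return ((pi12 + pi21) - (pi12 - pi21) ** 2) / n

# Hypothetical discordant probabilities and sample size.
general = (var_difference_from_covariance(0.20, 0.10, 200),
           var_difference_closed_form(0.20, 0.10, 200))

# Under H0 the squared term vanishes and only (pi12 + pi21) / N remains.
under_h0 = var_difference_closed_form(0.15, 0.15, 200)

print(general, under_h0)
```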

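If $\pi_{12} = \pi_{21}$ under $H_0$, then $\pi_{12} = \pi_{21} = \pi$ and $\pi = \pi_d/2$, where $\pi_d = \pi_{12} + \pi_{21}$ is the total discordant probability. Then

$$V(p_{12}-p_{21}\mid H_0) = \frac{\pi_{12}+\pi_{21}}{N} = \frac{\pi_d}{N}, \qquad \hat{V}(p_{12}-p_{21}\mid H_0) = \frac{p_{12}+p_{21}}{N},$$

and

$$Z_{\text{Modified}} = \frac{p_{12}-p_{21}}{\sqrt{(p_{12}+p_{21})/N}} = \frac{f-g}{\sqrt{f+g}},$$

where $f = n_{12}$ and $g = n_{21}$ are the two discordant cell frequencies.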
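The final statistic needs only the two discordant counts. The following sketch computes it together with a two-sided p-value from the standard normal approximation; the function name `mcnemar_z` and the counts are hypothetical, and no continuity correction is applied.

```python
import math

def mcnemar_z(f, g):
    """McNemar large-sample statistic z = (f - g) / sqrt(f + g).

    f, g : the two discordant cell counts.
    Returns z and the two-sided p-value under the standard normal approximation.
    """
    if f + g == 0:
        raise ValueError("at least one discordant pair is required")
    z = (f - g) / math.sqrt(f + g)
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided

# Hypothetical discordant counts.
z, p = mcnemar_z(25, 10)
print(round(z, 3), round(p, 4))
```

Squaring `z` gives the chi-square form of the test, $(f-g)^2/(f+g)$, on one degree of freedom.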