Subject: Statistics
Paper: Regression Analysis III
Module: Prospective and Retrospective Studies
Regression Analysis III 1 / 18
Development Team
Principal investigator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta
Paper co-ordinator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta
Content writers: Sayantee Jana, Graduate student, Department of Mathematics and Statistics, McMaster University; Sujit Kumar Ray, Analytics professional, Kolkata
Content reviewer: Department of Statistics, University of Calcutta
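Study designs
Q: What are the different types of study design? Q: What do they mean?
1. Retrospective study design
A study design that traces back into the past: subjects are selected according to their outcome, and their earlier exposure is then recorded.
2. Prospective study design
A study design that looks into the future: subjects are selected according to their exposure and are then followed to record the outcome.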
Types of Studies
Case-control study
A study which uses a retrospective design to trace back the past conditions of its sample units is called a case-control study.
- Case-control studies are very common in epidemiological and clinical research.
- The sampled units in the diseased group are called cases, and the sampled units in the non-diseased group are called controls.
- When each case is paired with a control that shares chosen characteristics (such as age or gender), the study is called a matched case-control study.
- When the cases and controls are sampled independently of each other, the study is called an unmatched case-control study.
Example of a retrospective study
Problem: To establish an association between smoking and lung cancer. [1]
Study design: In a study reported in the British Medical Journal in 1950, 709 cases of lung cancer and 709 controls were sampled from 20 hospitals in London, England, and were interviewed about their smoking behaviour.
The cases were sampled from patients admitted with lung cancer in the preceding year. The controls were sampled from non-cancer patients at the same hospital, of the same gender and within the same 5-year age group.
A smoker was defined as a person who had smoked at least one cigarette a day for at least a year.
[1] Agresti, A. (2014). Categorical Data Analysis. John Wiley & Sons.
Example
This is a matched case-control study.
Y: occurrence of lung cancer; X: smoking behaviour
Table: Contingency table of smoking by lung cancer.
Smoker    Lung cancer cases    Controls
Yes       688                  650
No        21                   59
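As a minimal sketch, the sample odds ratio for the table above can be computed in base R. The object names (`smoking`, `odds_cases`, `odds_controls`, `or_hat`) are illustrative choices, and the calculation treats the table as an ordinary 2 x 2 table, so it ignores the matching of cases to controls.

```r
# Counts from the smoking by lung cancer table above.
smoking <- matrix(c(688, 650,
                    21,  59),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(Smoker = c("Yes", "No"),
                                  Group  = c("Cases", "Controls")))

# Odds of being a smoker within each group (column-wise odds),
# which is the quantity a case-control design estimates directly.
odds_cases    <- smoking["Yes", "Cases"]    / smoking["No", "Cases"]
odds_controls <- smoking["Yes", "Controls"] / smoking["No", "Controls"]

# Sample odds ratio as the ratio of the two odds.
or_hat <- odds_cases / odds_controls
round(or_hat, 2)
```

A value above 1 indicates that the odds of being a smoker are higher among the lung cancer cases than among the controls.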
Types of Studies continued ...
- A study design that follows its subjects into the future to observe whether or not they develop the disease is called a prospective study design.
- Prospective studies are of two types:
  - Cohort studies, where the subjects themselves decide whether or not to be exposed to a certain condition (level of the predictor variable).
  - Cross-sectional studies, where the subjects are allocated to the different exposure conditions at the design stage of the experiment. (Designs in which the investigator allocates exposure at random are more commonly called clinical trials.)
- Cohort and cross-sectional studies are very common in clinical research.
Example of a prospective study design
Objective: The Physicians' Health Study Research Group at Harvard Medical School conducted a study to find out whether regular aspirin intake reduces mortality from cardiovascular disease. [2]
Study design: It was a blinded, 5-year randomized study in which the participating physicians took either one aspirin tablet or a placebo every other day. At the end of the study, the number of subjects who had suffered a heart attack, fatal or non-fatal, was recorded.
This is an example of a cross-sectional study in the sense used above: exposure (aspirin or placebo) was allocated at the design stage.
[2] Agresti, A. (2014). Categorical Data Analysis. John Wiley & Sons.
Example
Y: had a heart attack during the course of the study; X: aspirin use
Table: Contingency table of aspirin use by myocardial infarction.
Group      Heart attack    No attack
Aspirin    104             10933
Placebo    189             10845
Invariance of OR to study design
Consider a contingency table of the joint probabilities of smoking and lung cancer, and view the example from the perspective of both a prospective and a retrospective study design.
Y: occurrence of lung cancer; X: smoking behaviour
Table: Contingency table of probabilities of smoking by lung cancer.
Smoker    Lung cancer: Yes    Lung cancer: No
Yes       $\pi_{11}$          $\pi_{12}$
No        $\pi_{21}$          $\pi_{22}$
OR for Prospective study design
- Case I: Let us assume a prospective study design for the example just stated.
- This means that a population of smokers and non-smokers was followed into the future to find out who develops lung cancer (L.C.) and who does not, and the corresponding probabilities were then tabulated.
- Log odds of developing L.C. among smokers: $\log \dfrac{\pi_{11}}{\pi_{12}}$
- Log odds of developing L.C. among non-smokers: $\log \dfrac{\pi_{21}}{\pi_{22}}$
- $\log(\text{Odds Ratio}) = \log \dfrac{\pi_{11}}{\pi_{12}} - \log \dfrac{\pi_{21}}{\pi_{22}} = \log \dfrac{\pi_{11}\,\pi_{22}}{\pi_{12}\,\pi_{21}}$
OR for Retrospective study design
- Case II: Now let us assume a retrospective study design for the same example.
- This means that a group of lung cancer patients and a group of patients without lung cancer were traced back in time to find out who were smokers and who were not.
- Log odds that a lung cancer patient was a smoker: $\log \dfrac{\pi_{11}}{\pi_{21}}$
- Log odds that a patient without lung cancer was a smoker: $\log \dfrac{\pi_{12}}{\pi_{22}}$
- $\log(\text{Odds Ratio}) = \log \dfrac{\pi_{11}}{\pi_{21}} - \log \dfrac{\pi_{12}}{\pi_{22}} = \log \dfrac{\pi_{11}\,\pi_{22}}{\pi_{12}\,\pi_{21}}$
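The two cases can be checked numerically. The sketch below assumes a hypothetical table of joint probabilities (the values in `p` are made up for illustration and do not come from the module) and computes the log odds ratio the prospective way, the retrospective way, and as the log of the cross-product ratio.

```r
# Hypothetical joint probabilities (illustrative values, not from the source).
# Rows: smoker (Yes, No); columns: lung cancer (Yes, No).
p <- matrix(c(0.08, 0.32,
              0.02, 0.58),
            nrow = 2, byrow = TRUE,
            dimnames = list(Smoker = c("Yes", "No"),
                            LungCancer = c("Yes", "No")))
stopifnot(isTRUE(all.equal(sum(p), 1)))  # the four cells must sum to 1

# Case I (prospective): odds of lung cancer within each smoking group.
log_or_prospective <- log(p[1, 1] / p[1, 2]) - log(p[2, 1] / p[2, 2])

# Case II (retrospective): odds of smoking within each disease group.
log_or_retrospective <- log(p[1, 1] / p[2, 1]) - log(p[1, 2] / p[2, 2])

# Both expressions reduce to the log of the cross-product ratio.
log_cross_product <- log((p[1, 1] * p[2, 2]) / (p[1, 2] * p[2, 1]))

c(prospective   = log_or_prospective,
  retrospective = log_or_retrospective,
  cross_product = log_cross_product)
```

All three entries of the printed vector agree, whichever probabilities are placed in `p`, as long as every cell is positive.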
Invariance of OR to study design
- Both designs therefore lead to the same population odds ratio, the cross-product ratio $\dfrac{\pi_{11}\,\pi_{22}}{\pi_{12}\,\pi_{21}}$.
- Thus, when calculating the odds ratio, it does not matter whether the sample is drawn prospectively or retrospectively.
- The same conclusion holds for the sample odds ratio if we consider a contingency table of sample counts.
- Hence the odds ratio is invariant to study design.
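The statement about sample counts can be illustrated with a small sketch. The counts in `population`, the helper `odds_ratio()` and the sampling fractions (one half, one tenth) are all hypothetical choices made for this illustration. Sampling by exposure rescales a row of the table and sampling by outcome rescales a column, and neither changes the cross-product ratio.

```r
# Hypothetical counts (illustrative, not from the source).
# Rows: exposure (Exposed, Unexposed); columns: outcome (Disease, NoDisease).
population <- matrix(c(40, 960,
                       10, 990),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(Exposure = c("Exposed", "Unexposed"),
                                     Outcome  = c("Disease", "NoDisease")))

# Sample odds ratio as the cross-product ratio of a 2 x 2 table of counts.
odds_ratio <- function(tab) (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])

# Prospective-style sampling: keep all exposed subjects but only half
# of the unexposed subjects (rescales the second row).
prospective <- population
prospective["Unexposed", ] <- population["Unexposed", ] / 2

# Retrospective-style sampling: keep all diseased subjects but only
# one tenth of the non-diseased subjects (rescales the second column).
retrospective <- population
retrospective[, "NoDisease"] <- population[, "NoDisease"] / 10

ors <- sapply(list(population    = population,
                   prospective   = prospective,
                   retrospective = retrospective),
              odds_ratio)
ors
```

The three odds ratios coincide because a common factor applied to a whole row or a whole column cancels in the cross-product ratio.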
Summary
- There are two types of study design:
  - retrospective: a study design that looks into the past.
  - prospective: a study design that looks into the future.
- Retrospective studies: case-control studies.
- Prospective studies: cohort studies and cross-sectional studies.
- The odds ratio is invariant to study design.
Example 1 [3]
## A cross-sectional study investigating the relationship
## between dry cat food (DCF) and feline urologic syndrome
## (FUS) was conducted (Willeberg 1977). Counts of
## individuals in each group were as follows:
## DCF-exposed cats (cases, non-cases) 13, 2163
## Non DCF-exposed cats (cases, non-cases) 5, 3349
library(epiR)  # provides epi.2by2()
## Outcome variable (FUS) as columns:
dat <- matrix(c(13, 2163, 5, 3349), nrow = 2, byrow = TRUE)
rownames(dat) <- c("DF+", "DF-")
colnames(dat) <- c("FUS+", "FUS-")
dat
epi.2by2(dat = as.table(dat), method = "cross.sectional", conf.level = 0.95, units = 100, outcome = "as.columns")
[3] Help file of the function epi.2by2 in the library epiR, R Documentation.
Example 1 contd ...
## Outcome variable (FUS) as rows:
dat <- matrix(c(13, 5, 2163, 3349), nrow = 2, byrow = TRUE)
rownames(dat) <- c("FUS+", "FUS-")
colnames(dat) <- c("DF+", "DF-")
dat
epi.2by2(dat = as.table(dat), method = "cross.sectional", conf.level = 0.95, units = 100, outcome = "as.rows")
## The odds ratio is the same whether the outcome is placed
## in the rows or in the columns. This indicates that the
## odds ratio remains unchanged when we interchange the
## response and the predictor.
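The same check can be sketched in base R without the epiR package. The helper `odds_ratio()` is a hypothetical function written for this illustration; it computes the cross-product ratio of the Example 1 counts and of the transposed table.

```r
# Counts from Example 1: rows are exposure (DF+, DF-),
# columns are outcome (FUS+, FUS-).
dat <- matrix(c(13, 2163, 5, 3349), nrow = 2, byrow = TRUE,
              dimnames = list(c("DF+", "DF-"), c("FUS+", "FUS-")))

# Cross-product ratio of a 2 x 2 table of counts.
odds_ratio <- function(tab) (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])

or_outcome_as_columns <- odds_ratio(dat)
or_outcome_as_rows    <- odds_ratio(t(dat))  # t() swaps rows and columns

c(as_columns = or_outcome_as_columns, as_rows = or_outcome_as_rows)
```

Transposing the table swaps the two off-diagonal cells, so their product, and hence the odds ratio, is unchanged.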
Example 2 [4]
## Prospective study
library(faraway)  # provides the babyfood data
data(babyfood)
babyfood[c(1,3),]
# the log-odds of having a respiratory disease for
# breast-fed infants are
# log(47/447) = -2.25
# the log-odds of having a respiratory disease for
# bottle-fed infants are
# log(77/381) = -1.60
# log(OR) = -1.60 - (-2.25) = 0.65
# represents the increased risk of respiratory disease
# incurred by bottle feeding relative to breast feeding
[4] Faraway, J.J. (2006). Extending the Linear Model with R. Chapman & Hall/CRC.
Example 2 contd ...
## Retrospective study
# Had these same data come from a retrospective study,
# the log odds ratio would be the difference of the log odds
# of feeding type given the disease status.
# log(OR) = log(77/47) - log(381/447) = log(77/381) - log(47/447) = 0.65
# which is the same as the log odds ratio for the prospective
# study discussed above.
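The arithmetic in Example 2 can be reproduced with a short base R sketch. The counts are typed in directly from the figures quoted above, so the faraway package is not needed, and the object names are illustrative.

```r
# Counts quoted in Example 2 (disease, no disease) by feeding type.
bottle <- c(disease = 77, nondisease = 381)
breast <- c(disease = 47, nondisease = 447)

# Prospective view: log odds of disease within each feeding type.
log_or_prospective <- log(bottle[["disease"]] / bottle[["nondisease"]]) -
  log(breast[["disease"]] / breast[["nondisease"]])

# Retrospective view: log odds of bottle feeding within each disease status.
log_or_retrospective <- log(bottle[["disease"]] / breast[["disease"]]) -
  log(bottle[["nondisease"]] / breast[["nondisease"]])

round(c(prospective   = log_or_prospective,
        retrospective = log_or_retrospective), 2)
```

Both entries round to 0.65, in agreement with the hand calculation in the comments.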