Module: Simpson’s Paradox


Academic year: 2022


Paper: Regression Analysis III


Principal investigator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta

Paper co-ordinator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta

Content writer: Sayantee Jana, Graduate student, Department of Mathematics and Statistics, McMaster University

Content reviewer: Department of Statistics, University of Calcutta

What is Simpson’s Paradox?

Sometimes a marginal association points in the opposite direction from each of the corresponding conditional associations. This phenomenon is called Simpson’s paradox. Although the paradox is named after Simpson (1951), it dates back to Yule (1903).

• This phenomenon can occur in both quantitative and categorical data.

• Statisticians commonly use it to caution against inferring causal effects from an association between the response and predictor variables.

• Even when the response and predictor variables are very strongly associated, one or more confounding factors may exist that alter the results altogether. Hence controlling for the relevant factors is necessary to study the true association.


Let us consider the following 2 × 2 contingency table from a study of death penalty verdicts by race of defendant and victim.

Table: Death Penalty Verdict by Defendant’s Race

                      Death Penalty
Defendant’s Race      Yes      No
White                  53     430
Black                  15     176

Total no. of cases = 674

• Y: death penalty verdict (categories: yes, no)

• X: race of defendant (categories: black, white)

• Objective of the study: to study the effect of the defendant’s race on the death penalty verdict

• Percentage of white defendants who received the death penalty: 53/483 = 11.0%

• Percentage of black defendants who received the death penalty: 15/191 = 7.9%
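The two percentages above can be reproduced directly from the marginal table. The following is a minimal Python sketch; the dictionary layout and the helper name `death_penalty_rate` are illustrative choices, not part of the original module.

```python
# Marginal table: defendant's race -> (death penalty yes, death penalty no).
# Counts are copied from the marginal table above.
marginal = {
    "White": (53, 430),
    "Black": (15, 176),
}


def death_penalty_rate(yes, no):
    """Return the percentage of cases that ended in a death penalty verdict."""
    return 100.0 * yes / (yes + no)


for race, (yes, no) in marginal.items():
    print(f"{race} defendants: {death_penalty_rate(yes, no):.1f}%")
# prints:
# White defendants: 11.0%
# Black defendants: 7.9%
```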


• Thus it seems that the death penalty was imposed less often on black defendants than on white defendants.

• Now let us consider another factor: the victim’s race.

• Z: race of victim (categories: black, white)

• Let us stratify the marginal table with respect to Z.


Let us consider the partial table for white victims only.

Table: Death Penalty Verdict by Defendant’s Race (White Victims)

                      Death Penalty
Defendant’s Race      Yes      No
White                  53     414
Black                  11      37

Total no. of cases = 515

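• Percentage of white defendants who received the death penalty: 53/467 = 11.3%

• Percentage of black defendants who received the death penalty: 11/48 = 22.9%

• Conclusion: when the victims were white, the death penalty was imposed more often on black defendants than on white defendants.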


Let us consider the partial table for black victims only.

Table: Death Penalty Verdict by Defendant’s Race (Black Victims)

                      Death Penalty
Defendant’s Race      Yes      No
White                   0      16
Black                   4     139

Total no. of cases = 159

• Percentage of white defendants who received the death penalty: 0/16 = 0%

• Percentage of black defendants who received the death penalty: 4/143 = 2.8%

• Conclusion: when the victims were black, the death penalty was again imposed more often on black defendants than on white defendants.

• Thus, after controlling for the victim’s race, the death penalty was imposed more often on black defendants than on white defendants.

• But if we ignore the victim’s race, the conclusion is reversed.
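The reversal can be verified by computing the conditional rates within each victim stratum and then collapsing over the victim’s race. The sketch below uses only the counts from the two partial tables; the variable names and the nested-dictionary layout are assumptions made for illustration.

```python
# Partial tables: victim's race -> defendant's race -> (yes, no).
# Counts are copied from the two partial tables above.
partial = {
    "White victims": {"White": (53, 414), "Black": (11, 37)},
    "Black victims": {"White": (0, 16), "Black": (4, 139)},
}


def rate(yes, no):
    """Percentage of cases that ended in a death penalty verdict."""
    return 100.0 * yes / (yes + no)


# Conditional death penalty rates, controlling for victim's race.
conditional = {
    victim: {race: rate(*counts) for race, counts in table.items()}
    for victim, table in partial.items()
}

# Collapse over victim's race to recover the marginal table.
marginal = {}
for table in partial.values():
    for race, (yes, no) in table.items():
        prev_yes, prev_no = marginal.get(race, (0, 0))
        marginal[race] = (prev_yes + yes, prev_no + no)

marginal_rate = {race: rate(*counts) for race, counts in marginal.items()}

for victim, rates in conditional.items():
    print(victim, {race: round(r, 1) for race, r in rates.items()})
print("Ignoring victim's race", {race: round(r, 1) for race, r in marginal_rate.items()})
```

Within each stratum the rate for black defendants is higher, while the collapsed table shows the opposite ordering.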


• The association between X and Y changes direction depending on whether we ignore the victim’s race or control for it.

• This is due to the nature of the association between Z and each of X and Y.

• The association between Z and X is extremely strong.

• The odds ratio between X and Z is (467 × 143)/(48 × 16) = 87.0.
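The cell counts in this odds ratio are the row totals of the two partial tables (for example, 467 = 53 + 414 white defendants with white victims). A short Python sketch of the calculation follows; the function name `odds_ratio` and the variable names are illustrative, not part of the original module.

```python
def odds_ratio(n11, n12, n21, n22):
    """Sample odds ratio (n11 * n22) / (n12 * n21) for a 2 x 2 table."""
    return (n11 * n22) / (n12 * n21)


# Rows: defendant's race (white, black); columns: victim's race (white, black).
# Each cell is a row total from the corresponding partial table.
white_def_white_vic = 53 + 414  # 467
white_def_black_vic = 0 + 16    # 16
black_def_white_vic = 11 + 37   # 48
black_def_black_vic = 4 + 139   # 143

xz = odds_ratio(
    white_def_white_vic,
    white_def_black_vic,
    black_def_white_vic,
    black_def_black_vic,
)
print(f"Odds ratio between X and Z: {xz:.1f}")
# prints: Odds ratio between X and Z: 87.0
```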


• From the partial tables we see that, regardless of the defendant’s race, the death penalty was much more likely when the victims were white than when the victims were black.

• So white defendants mostly had white victims, and killing a white victim was more likely to result in the death penalty.

• Thus the marginal association shows a greater tendency for white defendants to receive the death penalty than the conditional associations do.
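The reasoning above can be checked numerically: each defendant group’s marginal death penalty rate is a weighted average of its conditional rates, with weights equal to the share of that group’s cases in each victim stratum. The following Python sketch assumes only the counts from the partial tables; the function name `marginal_rate_as_weighted_average` is hypothetical.

```python
# Defendant's race -> victim's race -> (death penalty yes, death penalty no).
# Counts are copied from the two partial tables.
partial = {
    "White": {"White victims": (53, 414), "Black victims": (0, 16)},
    "Black": {"White victims": (11, 37), "Black victims": (4, 139)},
}


def marginal_rate_as_weighted_average(strata):
    """Combine conditional rates using each stratum's share of the cases."""
    total = sum(yes + no for yes, no in strata.values())
    result = 0.0
    for victim, (yes, no) in strata.items():
        weight = (yes + no) / total      # share of this group's cases in the stratum
        cond_rate = yes / (yes + no)     # death penalty rate within the stratum
        print(f"  {victim}: weight {weight:.3f}, rate {100 * cond_rate:.1f}%")
        result += weight * cond_rate
    return 100.0 * result


for defendant, strata in partial.items():
    print(f"{defendant} defendants")
    overall = marginal_rate_as_weighted_average(strata)
    print(f"  marginal rate: {overall:.1f}%")
```

About 97% of the white defendants’ cases involved white victims, against about 25% of the black defendants’ cases, so the white defendants’ marginal rate is dominated by the high-rate stratum.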


• Simpson’s paradox occurs when the direction of the association between the response and predictor variables changes once another variable is introduced.

• Usually this variable is a confounder.

• Simpson’s paradox can occur in both categorical and continuous data.
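For the continuous case, a small sketch may help. The data below are hypothetical, invented purely for illustration, and do not come from the module: within each of two groups the least-squares slope of y on x is negative, yet the slope for the pooled data is positive.

```python
def slope(xs, ys):
    """Least-squares slope of y regressed on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data (not from the module): group name -> (x values, y values).
groups = {
    "group A": ([1, 2, 3], [5, 4, 3]),
    "group B": ([6, 7, 8], [10, 9, 8]),
}

# Slope within each group (conditional association).
within = {name: slope(xs, ys) for name, (xs, ys) in groups.items()}

# Slope after pooling the groups (marginal association).
pooled_x = [x for xs, _ in groups.values() for x in xs]
pooled_y = [y for _, ys in groups.values() for y in ys]
pooled = slope(pooled_x, pooled_y)

print("Within-group slopes:", within)
print(f"Pooled slope: {pooled:.3f}")
```

Here the grouping variable plays the role of Z: ignoring it reverses the sign of the estimated association between x and y.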

