Bayes Factors

Academic year: 2022

(1)

Subject: Statistics

Paper: Statistical Inference

Module: Bayesian Hypothesis Testing and Bayes Factors

(2)

Development Team

Principal investigator: Dr. Bhaswati Ganguli, Professor, Department of Statistics, University of Calcutta

Paper co-ordinator: Dr. Dipak K. Dey, Associate Dean and BOT Distinguished Professor, Department of Statistics, University of Connecticut

Content writer: Dr. Sourish Das, Assistant Professor, Chennai Mathematical Institute

Content reviewer: Department of Statistics, University of Calcutta

(3)

Outline

1. Bayesian p-values

2. Bayes Factors for model comparison

3. Easy to implement alternatives for model comparison


(6)

Bayesian Hypothesis Testing

• Bayesian hypothesis testing is less formal than the frequentist approach.

• In fact, Bayesian researchers typically summarize the posterior distribution without applying a rigid decision process.

• If one wanted to apply a formal process, Bayesian decision theory is the way to go: because one obtains a probability distribution over the parameter space, one can make expected-utility calculations based on the costs and benefits of different outcomes.

• Considerable energy has nevertheless been devoted to mapping Bayesian statistical models into the null hypothesis significance testing framework, with mixed results at best.


(10)

Similarities between Bayesian and Frequentist Hypothesis Testing

• Maximum likelihood estimates of parameter means and standard errors and Bayesian estimates under flat priors are equivalent.

• Asymptotically, the data overwhelm the choice of prior: with infinite data, priors would be irrelevant and Bayesian and frequentist results would converge.

• Frequentist one-tailed tests are essentially equivalent to what a Bayesian would get using credible intervals.


(13)

Differences between Frequentist and Bayesian Hypothesis Testing

• The most important pragmatic difference between Bayesian and frequentist hypothesis testing is that Bayesian methods are poorly suited for two-tailed tests.

• Why? Because the probability of any single point, such as zero, in a continuous distribution is zero. The best solution proposed so far is to calculate the probability that, say, a regression coefficient B lies in some range near zero, e.g.,

  two-sided p-value = P(−e < B < e)

• However, the choice of e seems very ad hoc unless there is some decision-theoretic basis.

• The other important difference is more philosophical: frequentist p-values violate the likelihood principle.
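As a sketch of the interval idea above (a hypothetical example, not from the slides): given posterior draws of a coefficient B, the quantity P(−e < B < e) is just the fraction of draws falling in (−e, e). The simulated normal draws below stand in for MCMC output, with assumed location and scale.

```python
import random

# Stand-in for MCMC output: hypothetical posterior draws of a
# regression coefficient B, assumed Normal(0.8, 0.3) for illustration.
random.seed(42)
draws = [random.gauss(0.8, 0.3) for _ in range(50_000)]

def two_sided_bayesian_p(draws, e):
    """Posterior mass in the interval (-e, e) around zero."""
    return sum(-e < b < e for b in draws) / len(draws)

p_narrow = two_sided_bayesian_p(draws, e=0.1)  # small neighborhood of zero
p_wide = two_sided_bayesian_p(draws, e=1.0)    # generous neighborhood
```

Shrinking or widening e changes the answer, which is exactly the ad hoc choice the slide warns about.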


(19)

Bayes Factors

• Bayes Factors are the dominant method of Bayesian model testing. They are the Bayesian analogues of likelihood ratio tests.

• The basic intuition is that prior and posterior information are combined in a ratio that provides evidence in favor of one model specification versus another.

• Bayes Factors are very flexible: multiple hypotheses can be compared simultaneously, and nested models are not required in order to make comparisons (though the compared models must, of course, have the same dependent variable).


(22)

The General Form for Bayes Factors

• Suppose that we observe data x and wish to test two competing models, M1 and M2, relating these data to two different sets of parameters, θ1 and θ2.

• We would like to know which of the following likelihood specifications is better:

  M1: f1(x|θ1)   and   M2: f2(x|θ2)

• Obviously, we would need prior distributions for θ1 and θ2 and prior probabilities for M1 and M2.


(25)

The General Form for Bayes Factors

• The posterior odds ratio in favor of M1 over M2 is:

  Posterior Odds = Prior Odds × Bayes Factor

  π(M1|x) / π(M2|x) = [p(M1) / p(M2)] × [∫ f1(x|θ1) p1(θ1) dθ1 / ∫ f2(x|θ2) p2(θ2) dθ2]

• Rearranging terms, we find the Bayes factor is:

  Bayes Factor = B(x) = [π(M1|x) / p(M1)] / [π(M2|x) / p(M2)]

• If we have nested models and P(M1) = P(M2) = 0.5, then the Bayes factor reduces to the likelihood ratio.
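As a toy illustration of the marginal-likelihood ratio (hypothetical models, not from the slides): for x = k successes in n Bernoulli trials, compare a point null M1: θ = 0.5 against M2: θ ~ Beta(1, 1). Both marginal likelihoods have closed forms, so B(x) can be computed exactly.

```python
from math import comb, lgamma, log, exp

def log_beta(a, b):
    """log B(a, b) via log-gamma, for the Beta-binomial marginal."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def bayes_factor_point_null(k, n):
    """B(x) = m1(x) / m2(x) for x = k successes in n trials:
    M1: theta = 0.5 exactly; M2: theta ~ Beta(1, 1) (uniform prior)."""
    log_m1 = log(comb(n, k)) + n * log(0.5)                # binomial at theta = 0.5
    log_m2 = log(comb(n, k)) + log_beta(k + 1, n - k + 1)  # integrated over theta
    return exp(log_m1 - log_m2)

bf_balanced = bayes_factor_point_null(10, 20)  # data consistent with theta = 0.5
bf_skewed = bayes_factor_point_null(18, 20)    # data far from theta = 0.5
```

With balanced data the factor exceeds 1 (evidence for the point null); with skewed data it falls far below 1.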


(29)

Rule of Thumb

• Bayes Factor:

  B(x) = [π(M1|x) / p(M1)] / [π(M2|x) / p(M2)]

• With this setup, if we interpret model 1 as the null model, then:

  1. If B(x) ≥ 1, then model 1 is supported.
  2. If 1 > B(x) ≥ 10^(−1/2), then there is minimal evidence against model 1.
  3. If 10^(−1/2) > B(x) ≥ 10^(−1), then there is substantial evidence against model 1.
  4. If 10^(−1) > B(x) ≥ 10^(−2), then there is strong evidence against model 1.
  5. If 10^(−2) > B(x), then there is decisive evidence against model 1.
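The thresholds above translate directly into code; a minimal sketch (the function name is mine):

```python
def evidence_category(bf):
    """Classify a Bayes factor B(x), with model 1 (the null) in the
    numerator, on the 10^(-1/2), 10^(-1), 10^(-2) scale above."""
    if bf >= 1:
        return "model 1 supported"
    if bf >= 10 ** -0.5:
        return "minimal evidence against model 1"
    if bf >= 10 ** -1:
        return "substantial evidence against model 1"
    if bf >= 10 ** -2:
        return "strong evidence against model 1"
    return "decisive evidence against model 1"
```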


(36)

The Bad News

• Unfortunately, while Bayes Factors are rather intuitive, as a practical matter they are often quite difficult to calculate.

• However, in the MCMCpack package of R, Bayes Factors can be computed routinely for standard statistical models.

• You may also want to use Carlin and Chib's technique for computing Bayes Factors for competing non-nested regression models, reported in the Journal of the Royal Statistical Society, Series B, 57(3), 1995.

• Our discussion will focus on alternatives to the Bayes Factor.


(40)

Alternatives to the Bayes Factor for model assessment

• Let θ denote your estimates of the parameter means (or medians, or modes) in your model, and suppose that the Bayes estimate is approximately equal to the maximum likelihood estimate. Then the following statistics used in frequentist inference will be useful diagnostics.

• Good: the likelihood ratio

  Ratio = −2 [log L(θ_Restricted Model | y) − log L(θ_Full Model | y)]

• This statistic will always favor the unrestricted model, but when the Bayes estimators are equivalent to the maximum likelihood estimates, the Ratio is distributed as a χ² with degrees of freedom equal to the number of tested parameters.
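A minimal numeric sketch of the statistic (the fitted log-likelihoods are assumed values, not from the slides):

```python
def likelihood_ratio(loglik_restricted, loglik_full):
    """Ratio = -2 [log L(theta_Restricted | y) - log L(theta_Full | y)].
    Under the conditions on the slide, compare against a chi-squared
    quantile with df = number of tested parameters."""
    return -2.0 * (loglik_restricted - loglik_full)

# Hypothetical fits: the full model has 2 extra parameters.
stat = likelihood_ratio(-120.3, -115.1)
reject_at_5pct = stat > 5.99  # chi-squared(df=2) critical value, 5% level
```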


(43)

Alternatives to the Bayes Factor for model assessment

• As before, let θ denote your parameter estimates and suppose the Bayes estimate is approximately equal to the maximum likelihood estimate.

• Better: Akaike Information Criterion (AIC)

  AIC = −2 log L(θ|y) + 2p

• where p = the number of parameters, including the intercept.

• To compare two models, compare the AIC from model 1 against the AIC from model 2; the model with the smaller AIC is preferred.


(47)

Alternatives to the Bayes Factor for model assessment

• Models do not need to be nested.

• The AIC tends to be biased in favor of more complicated models, because the log-likelihood tends to increase faster than the penalty on the number of parameters.

• Bayesian Information Criterion (BIC):

  BIC = −2 log L(θ|y) + p × log(n)

  where p is the number of parameters and n is the sample size.

• This statistic can also be used for non-nested models.

• BIC1 − BIC2 ≈ −2 log(Bayes Factor_12) for Model 1 vs Model 2.
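Both criteria are one-liners; the sketch below also uses the BIC difference to approximate a Bayes factor (the log-likelihoods are assumed values for illustration):

```python
from math import log, exp

def aic(loglik, p):
    """AIC = -2 log L + 2p, with p = number of parameters incl. intercept."""
    return -2.0 * loglik + 2.0 * p

def bic(loglik, p, n):
    """BIC = -2 log L + p log(n), with n = sample size."""
    return -2.0 * loglik + p * log(n)

# Hypothetical fits on n = 189 observations:
n = 189
ll_full, p_full = -113.9, 5
ll_small, p_small = -116.5, 4

delta_bic = bic(ll_full, p_full, n) - bic(ll_small, p_small, n)
# BIC1 - BIC2 ~ -2 log(BF12), so:
approx_bf_12 = exp(-delta_bic / 2.0)
```

Here the fuller model wins on AIC, but the heavier BIC penalty nearly erases its advantage, illustrating the pro-complexity bias of AIC described above.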


(52)

Application

• Consider the ‘birthwt’ dataset available in the MASS package of R.

• The dataset examines the risk factors associated with low infant birth weight.

  low_i = 1 if the birth weight is less than 2.5 kg, and 0 otherwise,   i = 1, 2, ..., 189

  z_i = β0 + β1 Age_i + β2 I(race_i = black) + β3 I(race_i = others) + β4 I(Smoke_i = yes) + ε_i,   ε_i ~ N(0, 1)

  P(low = 1 | Age, race, smoke) = P(z > 0 | Age, race, smoke)
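Under this latent-variable setup, P(z > 0 | X) = Φ(x'β), with Φ the standard normal CDF. A sketch of that link (the coefficient values are hypothetical, not the fitted ones):

```python
from math import erf, sqrt

def std_normal_cdf(x):
    """Phi(x) via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def prob_low(beta, age, race_black, race_other, smoke):
    """P(low = 1) = P(z > 0) = Phi(x'beta) under the probit model."""
    xb = (beta[0] + beta[1] * age + beta[2] * race_black
          + beta[3] * race_other + beta[4] * smoke)
    return std_normal_cdf(xb)

# Hypothetical coefficient values, for illustration only:
beta = [0.5, -0.03, 0.6, 0.4, 0.5]
p_smoker = prob_low(beta, age=25, race_black=0, race_other=0, smoke=1)
p_nonsmoker = prob_low(beta, age=25, race_black=0, race_other=0, smoke=0)
```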


(56)

Application

Posterior Summary

> library(MCMCpack)

> set.seed(8135)

> data(birthwt)

> M1 <- MCMCprobit(low~as.factor(race)+age+smoke

+ , data=birthwt, b0 = 0, B0 = 10

+ ,marginal.likelihood="Chib95")

> M2 <- MCMCprobit(low~as.factor(race) +smoke

+ , data=birthwt, b0 = 0, B0 = 10

+ ,marginal.likelihood="Chib95")

> M3 <- MCMCprobit(low~as.factor(race) +age

+ , data=birthwt, b0 = 0 , B0 = 10

+ ,marginal.likelihood="Chib95")


(57)

Application

Posterior Summary

> BF <- BayesFactor(M1, M2, M3)
> round(BF$BF.mat, digits=3)

        M1    M2    M3
  M1 1.000 1.445 6.807
  M2 0.692 1.000 4.711
  M3 0.147 0.212 1.000

• BF_1,2 = 1.445 indicates that the data are 1.445 times more likely under Model 1 (M1) than under Model 2 (M2). This can be considered anecdotal evidence.

• BF_1,3 = 6.807 indicates that the data are 6.807 times more likely under Model 1 (M1) than under Model 3 (M3). This can be considered moderate evidence.

• BF_2,3 = 4.711 indicates that the data are 4.711 times more likely under Model 2 (M2) than under Model 3 (M3).

