Subject: Statistics
Paper: Advanced Data Analysis
Module: Some Applications in Bootstrap-2
Development Team
Principal investigator: Prof. Bhaswati Ganguli, Department of Statistics, University of Calcutta
Paper co-ordinator: Prof. Kalyan Das, Department of Statistics, University of Calcutta
Content writer: Souvik Kumar Bandyopadhyay, Indian Institute of Public Health, Hyderabad, Public Health Foundation of India
Content reviewer: Prof. Bhaswati Ganguli, Department of Statistics, University of Calcutta
Objective
- Bootstrap of mean
- Bootstrap of median
Some Applications of Bootstrap
Recap
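- Bias of an estimator $\hat{\theta}$ of $\theta$: $b = E(\hat{\theta} - \theta)$
- For a univariate normal distribution:
  - The MLE of $\sigma^2$ is $S_n = \sum_{i=1}^{n} (X_i - \bar{X})^2 / n$, where $\bar{X}$ is the sample mean
  - Its bias is $b = -\sigma^2 / n$
  - The bootstrap estimate of the bias $b$ is $B^*$, where $B^* = E(\theta^* - \hat{\theta}) = E(\theta^* - S_n)$ and $\theta^* = \sum_{i=1}^{n} (X_i^* - \bar{X}^*)^2 / n$, with $\bar{X}^*$ the mean of the bootstrap sample
  - The Monte Carlo approximation to $B^*$ is $B_{Monte} = \sum_{j=1}^{N} B_j^* / N$, where $B_j^* = \theta_j^* - \hat{\theta}$, $\theta_j^*$ is the uncorrected sample variance of the $j$th bootstrap sample, and $N$ is the number of bootstrap samples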
Ref: An Introduction to Bootstrap Methods with Applications to R by Michael R. Chernick and Robert A. LaBudde, Chapter 2
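The bias of the MLE quoted in the recap can be derived in a few lines. The following is a short sketch assuming $X_1, \dots, X_n$ are i.i.d. $N(\mu, \sigma^2)$, so that $\sum_i (X_i - \bar{X})^2 / \sigma^2$ follows a chi-squared distribution with $n - 1$ degrees of freedom:

```latex
\begin{align*}
E(S_n) &= \frac{1}{n}\, E\!\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right]
        = \frac{\sigma^2}{n}\, E\!\left[\chi^2_{n-1}\right]
        = \frac{n-1}{n}\,\sigma^2, \\
b &= E(S_n - \sigma^2) = \frac{n-1}{n}\,\sigma^2 - \sigma^2 = -\frac{\sigma^2}{n}.
\end{align*}
```

For $n = 25$ and $\sigma^2 = 1$ this gives $b = -1/25 = -0.04$.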
# Example from Chernick and LaBudde, Chapter 2
set.seed(5^13) # set random seed for reproducibility
n <- 25
x <- rnorm(n)
varx <- var(x)*(n-1)/n # sample variance, uncorrected
# sample variance, bias relative to the true value of 1.0,
# and expected value of the bias, -sigma^2/n = -1/n
c(varx, varx - 1.0, -1/n)
## [1] 0.8665871 -0.1334129 -0.0400000
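A single sample of size 25 gives only one realization of the estimation error, so the observed error can differ noticeably from the theoretical bias $-\sigma^2/n$. The sketch below checks the theoretical value by simulating many samples. The seed, the replication count `n_rep`, and the helper name `mle_var` are illustrative assumptions, not part of the original example.

```r
# Sketch: check the theoretical bias -sigma^2/n of the uncorrected variance
# by simulation. Seed, n_rep and mle_var are illustrative assumptions.
set.seed(123)
n <- 25          # sample size, as in the example
n_rep <- 20000   # number of simulated samples (assumed value)

# Uncorrected (MLE) variance: divides by n instead of n - 1
mle_var <- function(x) mean((x - mean(x))^2)

# Draw n_rep standard normal samples and compute the MLE variance of each
estimates <- replicate(n_rep, mle_var(rnorm(n)))

sim_bias <- mean(estimates) - 1.0   # the true variance is 1
theory_bias <- -1 / n               # -sigma^2 / n with sigma^2 = 1

c(simulated = sim_bias, theoretical = theory_bias)
```

Averaged over many samples, the simulated bias should settle close to $-1/n$, even though any one sample may be far from it.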
B <- 5000 # number of bootstrap resamples
bvarx <- numeric(B) # preallocate vector of resample variances
for (i in 1:B) { # for each resample
  xstar <- sample(x, n, replace=TRUE) # generate resample of size n from data
  bvarx[i] <- var(xstar)*(n-1)/n # resample variance, uncorrected
}
thetastar <- mean(bvarx) # mean of the resample variances
# resample variance estimate and bootstrap bias estimate
c(thetastar, thetastar - varx)
## [1] 0.82884859 -0.03773855
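The resampling loop can be wrapped in a small reusable function. The sketch below is one possible arrangement in base R; the names `boot_bias` and `mle_var` and the seed are hypothetical choices made for illustration. It also compares the Monte Carlo value with the ideal bootstrap bias for this particular statistic, which is $-\hat{\theta}/n$, because the expected uncorrected variance of a resample drawn from the empirical distribution is $(n-1)\hat{\theta}/n$.

```r
# Sketch: bootstrap bias estimate as a reusable function.
# boot_bias and mle_var are hypothetical helper names.
boot_bias <- function(x, stat, B = 5000) {
  theta_hat <- stat(x)
  # statistic recomputed on B resamples drawn with replacement
  theta_star <- replicate(B, stat(sample(x, length(x), replace = TRUE)))
  list(theta_hat = theta_hat,
       bias = mean(theta_star) - theta_hat,
       theta_star = theta_star)
}

# Uncorrected (MLE) variance
mle_var <- function(x) mean((x - mean(x))^2)

set.seed(2024)      # assumed seed, for reproducibility only
x <- rnorm(25)
res <- boot_bias(x, mle_var)

# Monte Carlo bootstrap bias versus the ideal (B -> infinity) value
c(monte_carlo = res$bias, ideal = -res$theta_hat / length(x))
```

The two values should agree up to Monte Carlo error, which shrinks as `B` grows.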
Some Applications of Bootstrap: Jackknife
require('bootstrap')
## Loading required package: bootstrap
theta <- function(x) var(x)*(n-1)/n # uncorrected variance (scaled with the global n)
jack <- jackknife(x, theta)
names(jack)
## [1] "jack.se" "jack.bias" "jack.values" "call"
jvarx <- mean(jack$jack.values)
jvarx
## [1] 0.8665871
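The jackknife quantities can also be computed by hand in base R, which makes the leave-one-out logic explicit. The sketch below uses its own simulated data and a hypothetical helper `mle_var`. Unlike the `theta` defined above, which scales by the fixed global `n`, this helper divides by the length of whatever sample it receives, so the leave-one-out values are genuine uncorrected variances of samples of size `n - 1`.

```r
# Sketch: jackknife bias and standard error by hand (base R only).
# mle_var is a hypothetical helper; the seed and data are illustrative.
mle_var <- function(x) mean((x - mean(x))^2)  # uncorrected variance

set.seed(42)
x <- rnorm(25)
n <- length(x)

theta_hat <- mle_var(x)

# Leave-one-out estimates: drop observation i, recompute the statistic
theta_loo <- vapply(seq_len(n), function(i) mle_var(x[-i]), numeric(1))

# Standard jackknife bias and standard error formulas
jack_bias <- (n - 1) * (mean(theta_loo) - theta_hat)
jack_se <- sqrt((n - 1) / n * sum((theta_loo - mean(theta_loo))^2))

# Bias-corrected estimate
theta_corrected <- theta_hat - jack_bias

c(theta_hat = theta_hat, jack_bias = jack_bias,
  corrected = theta_corrected, unbiased = var(x))
```

For this statistic the jackknife bias equals $-s^2/n$, where $s^2$ is the usual unbiased variance, so the corrected estimate coincides with `var(x)`.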
Some Applications of Bootstrap: A problem
An Introduction to Bootstrap Methods with Applications to R: Problem 1, Chapter 2
Airline accidents: According to the U.S. National Transportation Safety Board, the numbers of airline accidents by year from 1983 to 2006 were 23, 16, 21, 24, 34, 30, 28, 24, 26, 18, 23, 23, 36, 37, 49, 50, 51, 56, 46, 41, 54, 30, 40, and 31. For the sample data, compute the mean and its standard error, and the median.
Then compute the bootstrap estimates of the same quantities.
Some Applications of Bootstrap: Solution
set.seed(1)
x <- c(23, 16, 21, 24, 34, 30, 28, 24, 26, 18, 23, 23, 36, 37, 49, 50, 51, 56, 46, 41, 54, 30, 40, 31)
n <- length(x)
B <- 1000 # number of bootstrap resamples
xMean <- numeric(B) # bootstrap means
xMed <- numeric(B) # bootstrap medians
for (i in 1:B) { # for each bootstrap resample
  xx <- sample(x, n, replace=TRUE) # resample
  xMean[i] <- mean(xx) # keep track of mean estimates
  xMed[i] <- median(xx) # keep track of median estimates
}
# show mean of data and calculated standard error
c(mean(x), sd(x)/sqrt(n))
## [1] 33.791667 2.462751
# show bootstrap mean estimate and standard error
c(mean(xMean), sd(xMean))
## [1] 33.690417 2.472755
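The two standard errors shown for the mean come from different formulas. As a brief sketch in the notation of this solution, where $\bar{x}^*_b$ denotes the mean of the $b$th resample (`xMean[i]` in the code) and $s$ is the sample standard deviation:

```latex
\begin{align*}
\widehat{SE}_{\text{formula}}(\bar{x}) &= \frac{s}{\sqrt{n}},
  \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2, \\
\widehat{SE}_{\text{boot}}(\bar{x}) &= \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}
  \left(\bar{x}^*_b - \bar{x}^*_{\cdot}\right)^2},
  \qquad \bar{x}^*_{\cdot} = \frac{1}{B}\sum_{b=1}^{B}\bar{x}^*_b .
\end{align*}
```

As $B \to \infty$ the bootstrap standard error of the mean tends to $\sqrt{(n-1)/n}\; s/\sqrt{n}$, which is why the two reported values (2.462751 and 2.472755) are close for $n = 24$.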
median(x) #show median of data
## [1] 30.5
c(mean(xMed), sd(xMed)) #show median estimate and standard error
## [1] 31.421500 3.880741
median(xMed) #show median of medians
## [1] 30.5
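The same resampling scheme works for any statistic, so the mean and median cases can share one helper. The sketch below is a minimal base R version; the function name `boot_se` and the variable name `accidents` are hypothetical, and because the random draws are consumed in a different order than in the loop above, the numbers it prints need not match the output shown earlier.

```r
# Sketch: bootstrap estimate and standard error for an arbitrary statistic.
# boot_se and accidents are hypothetical names used for illustration.
boot_se <- function(x, stat, B = 1000) {
  # statistic recomputed on B resamples drawn with replacement
  stats <- replicate(B, stat(sample(x, length(x), replace = TRUE)))
  c(estimate = mean(stats), se = sd(stats))
}

# Airline accident counts, 1983 to 2006, from the problem statement
accidents <- c(23, 16, 21, 24, 34, 30, 28, 24, 26, 18, 23, 23,
               36, 37, 49, 50, 51, 56, 46, 41, 54, 30, 40, 31)

set.seed(1)
res_mean <- boot_se(accidents, mean)
res_median <- boot_se(accidents, median)

rbind(mean = res_mean, median = res_median)
```

Unlike the mean, the median has no simple closed-form standard error, which is where the bootstrap estimate is most useful.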