The R package **boot** makes it easy to generate bootstrap samples of virtually any statistic that you can calculate in R. From these samples, you can generate estimates of bias, bootstrap confidence intervals, or plots of your bootstrap replicates. We demonstrate a few of these techniques on this page, and you can read more details on the package's CRAN page. Before using commands in the **boot** package, you must first install the package and load it into your workspace. We will use the hsb2 dataset for all of the examples on this page.

    install.packages("boot", dependencies = TRUE)
    library(boot)
    hsb2 <- read.table("https://stats.idre.ucla.edu/stat/data/hsb2.csv", sep = ",", header = TRUE)

## Using the boot command

The **boot** command resamples your dataset and calculates your statistic(s) of interest on each of these samples. Before calling **boot**, you need to define a function that returns the statistic(s) that you would like to bootstrap. The first argument passed to the function should be your dataset. The second argument can be an index vector of the observations in your dataset to use, or a frequency or weight vector that informs the sampling probabilities; the **stype** argument of **boot** controls which of these is passed, and indices are the default. The example below uses the default index vector and resamples from all of our observations. The statistic of interest here is the correlation coefficient of **write** and **math**.

    fc <- function(d, i){
      d2 <- d[i,]
      return(cor(d2$write, d2$math))
    }
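The frequency form of the second argument described above can be sketched as follows. Here the function receives a vector of counts (how many times each row was drawn) and **boot** is called with `stype = "f"`. The data frame `toy` and the names `fc_index` and `fc_freq` are hypothetical stand-ins, simulated so that the example runs without downloading hsb2.

```r
library(boot)

# Hypothetical stand-in for hsb2: two correlated score columns (simulated)
set.seed(1)
n <- 200
math <- rnorm(n, mean = 52, sd = 9)
toy <- data.frame(math = math, write = 0.6 * math + rnorm(n, mean = 20, sd = 7))

# Index form: i holds the row numbers drawn for this bootstrap sample
fc_index <- function(d, i) {
  d2 <- d[i, ]
  cor(d2$write, d2$math)
}

# Frequency form: f[k] is the number of times row k was drawn
fc_freq <- function(d, f) {
  d2 <- d[rep(seq_len(nrow(d)), times = f), ]
  cor(d2$write, d2$math)
}

set.seed(626)
b_index <- boot(toy, fc_index, R = 200)               # stype = "i" is the default
set.seed(626)
b_freq <- boot(toy, fc_freq, R = 200, stype = "f")

# On the original data both forms see every row exactly once,
# so the two t0 values should agree
all.equal(b_index$t0, b_freq$t0)
```

The index form is usually the simplest to write; the frequency and weight forms are mainly useful when the statistic is naturally expressed in terms of counts or weights.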

With the function **fc** defined, we can use the **boot** command, providing our dataset name, our function, and the number of bootstrap samples to be drawn.

    # turn off set.seed() if you want the results to vary
    set.seed(626)
    bootcorr <- boot(hsb2, fc, R=500)
    bootcorr

    ORDINARY NONPARAMETRIC BOOTSTRAP

    Call:
    boot(data = hsb2, statistic = fc, R = 500)

    Bootstrap Statistics :
         original        bias    std. error
    t1* 0.6174493 -0.001528707   0.04020362
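To make the resampling step concrete, here is a minimal hand-rolled sketch of an ordinary nonparametric bootstrap in base R. It is not how **boot** is implemented internally, and its numbers need not match those from **boot** even with the same seed, since the random draws may be consumed in a different order. The data frame `toy` and the helper `one_replicate` are hypothetical names introduced for illustration.

```r
# Hypothetical stand-in for hsb2 (simulated), so the sketch runs offline
set.seed(1)
n <- 200
math <- rnorm(n, mean = 52, sd = 9)
toy <- data.frame(math = math, write = 0.6 * math + rnorm(n, mean = 20, sd = 7))

# Statistic computed on the original data ("original" in the printed output)
t0 <- cor(toy$write, toy$math)

# One bootstrap replicate: draw n row indices with replacement, then recompute
one_replicate <- function(d) {
  i <- sample(nrow(d), replace = TRUE)
  cor(d$write[i], d$math[i])
}

set.seed(626)
R <- 500
t_star <- replicate(R, one_replicate(toy))

# Summaries corresponding to the "bias" and "std. error" columns
c(original = t0, bias = mean(t_star) - t0, std.error = sd(t_star))
```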

While the printed output for bootcorr is brief, R saves additional information that can be listed:

    summary(bootcorr)

              Length Class      Mode
    t0          1    -none-     numeric
    t         500    -none-     numeric
    R           1    -none-     numeric
    data       11    data.frame list
    seed      626    -none-     numeric
    statistic   1    -none-     function
    sim         1    -none-     character
    call        4    -none-     call
    stype       1    -none-     character
    strata    200    -none-     numeric
    weights   200    -none-     numeric

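Knowing the **seed** value, which records the state of R's random number generator when **boot** was called, would allow us to replicate this analysis if needed. (Its length of 626 is the size of that state vector, not the argument we gave to set.seed.) From the **t** vector and **t0**, we can calculate the bias and standard error: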

    mean(bootcorr$t) - bootcorr$t0

    [1] -0.001528707

    sd(bootcorr$t)

    [1] 0.04020362
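The replication mentioned above can be sketched as follows. This assumes, as the package documentation describes, that the **seed** component holds the value of `.Random.seed` from the moment **boot** started work. The data frame `toy` is a simulated, hypothetical stand-in for hsb2, and `first` and `second` are illustrative names.

```r
library(boot)

# Hypothetical stand-in for hsb2 (simulated)
set.seed(1)
n <- 200
math <- rnorm(n, mean = 52, sd = 9)
toy <- data.frame(math = math, write = 0.6 * math + rnorm(n, mean = 20, sd = 7))

fc <- function(d, i) {
  d2 <- d[i, ]
  cor(d2$write, d2$math)
}

first <- boot(toy, fc, R = 200)

# Put the generator back in the state recorded when the first run started
assign(".Random.seed", first$seed, envir = .GlobalEnv)
second <- boot(toy, fc, R = 200)

# The rerun should reproduce the same bootstrap replicates
identical(first$t, second$t)
```

In everyday use, calling set.seed() with a fixed value before **boot** is the simpler route; restoring the stored state is mostly helpful when only the saved "boot" object is available.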

To use other commands in the **boot** package, you will often need to provide a “boot” object:

    class(bootcorr)

    [1] "boot"

## Bootstrap confidence intervals and plots

To look at a histogram and normal quantile-quantile plot of your bootstrap estimates, you can use **plot** with the “boot” object you created. The histogram includes a dotted vertical line indicating the location of the original statistic.

    plot(bootcorr)

Using the **boot.ci** command, you can generate several types of confidence intervals from your bootstrap samples.

    boot.ci(boot.out = bootcorr, type = c("norm", "basic", "perc", "bca"))

    BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
    Based on 500 bootstrap replicates

    CALL :
    boot.ci(boot.out = bootcorr, type = c("norm", "basic", "perc", "bca"))

    Intervals :
    Level      Normal              Basic
    95%   ( 0.5402,  0.6978 )   ( 0.5406,  0.7063 )

    Level     Percentile            BCa
    95%   ( 0.5286,  0.6943 )   ( 0.5291,  0.6946 )
    Calculations and Intervals on Original Scale
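If you need the interval endpoints as numbers rather than printed text, they can be pulled out of the object that **boot.ci** returns. The sketch below assumes the layout described in the package documentation: the `normal` component has three columns (level, lower, upper) and the `percent` component has five (level, two order-statistic positions, lower, upper). The data frame `toy` is a simulated, hypothetical stand-in for hsb2, and `limits` is an illustrative name.

```r
library(boot)

# Hypothetical stand-in for hsb2 (simulated)
set.seed(1)
n <- 200
math <- rnorm(n, mean = 52, sd = 9)
toy <- data.frame(math = math, write = 0.6 * math + rnorm(n, mean = 20, sd = 7))

fc <- function(d, i) {
  d2 <- d[i, ]
  cor(d2$write, d2$math)
}

set.seed(626)
b <- boot(toy, fc, R = 500)
ci <- boot.ci(b, type = c("norm", "perc"))

# Collect the lower and upper limits of each interval into one matrix
limits <- rbind(
  normal     = ci$normal[1, 2:3],
  percentile = ci$percent[1, 4:5]
)
colnames(limits) <- c("lower", "upper")
limits
```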

Four 95% confidence intervals are presented: normal, basic, percentile, and bias-corrected and accelerated (BCa). A fifth type, the studentized interval, requires an estimate of the statistic's variance from each bootstrap sample.
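One way to supply those variances is to have the statistic function return two values, the estimate followed by an estimate of its variance; by default **boot.ci** treats the second element as the variance. The sketch below is illustrative only: the variance formula is an assumed large-sample approximation for a Pearson correlation, not something provided by the **boot** package, and `toy` and `fc_var` are hypothetical names.

```r
library(boot)

# Hypothetical stand-in for hsb2 (simulated)
set.seed(1)
n <- 200
math <- rnorm(n, mean = 52, sd = 9)
toy <- data.frame(math = math, write = 0.6 * math + rnorm(n, mean = 20, sd = 7))

# Returns the correlation and, second, an approximate variance for it.
# ASSUMPTION: (1 - r^2)^2 / (n - 1) is a normal-theory approximation
# chosen for illustration; other variance estimates could be used instead.
fc_var <- function(d, i) {
  d2 <- d[i, ]
  r <- cor(d2$write, d2$math)
  c(r, (1 - r^2)^2 / (nrow(d2) - 1))
}

set.seed(626)
b <- boot(toy, fc_var, R = 999)

# With a two-element statistic, boot.ci uses element 1 as the estimate
# and element 2 as its variance when building the studentized interval
ci <- boot.ci(b, type = "stud")
ci$student
```

When no convenient variance formula exists, the variance for each replicate is sometimes estimated with a second, nested bootstrap inside the statistic function, at a much higher computational cost.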