Session 7.3.4 Bootstrap Standard Error

Pick a statistic (mean / median / SD), enter data, then run a nonparametric or parametric bootstrap. Default n_B=200.

Bootstrap — General Procedure (plain English)

Define the question. What single number do you care about? (mean, median, or SD).
Collect a sample. Small is OK (e.g., n=10). Enter it in the grid. Units matter (minutes, mm, etc.).
Check bias risks. Is your sample representative? Any coverage/nonresponse/time-of-day bias? SE measures wobble (precision), not bias.
Choose mode. Nonparametric (resample your data) or Parametric (draw from a model, e.g., Normal with μ,σ).
Pick replicates. Start with n_B = 200 for class; use 1000+ for reports.
Resample & compute. For each bootstrap sample, compute the chosen statistic → a distribution of replicates.
Summarize uncertainty. Bootstrap SE = sd of replicates; 95% CI via percentiles or $\pm 2\times \text{SE}$.
Interpret. Low SE = precise, not automatically “good data.” Inspect histogram shape; consider median if outliers/skew.
Report. Example: “Mean commute = 11.7 min (SE 0.8; 95% CI 10.1–13.2; n=10; nonparametric bootstrap).”
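The resample, compute, and summarize steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the tool's own implementation. The function names `bootstrap_replicates` and `percentile_interval` are hypothetical, and the data are made-up commute times rather than the tool's example set.

```python
import random
import statistics

def bootstrap_replicates(data, statistic, n_b=200, seed=None):
    """Nonparametric bootstrap: resample `data` with replacement n_b times."""
    rng = random.Random(seed)
    n = len(data)
    return [statistic(rng.choices(data, k=n)) for _ in range(n_b)]

def percentile_interval(replicates, level=0.95):
    """Simple percentile interval: cut equal tails off the sorted replicates."""
    ordered = sorted(replicates)
    alpha = (1 - level) / 2
    lo_idx = int(alpha * (len(ordered) - 1))
    hi_idx = (len(ordered) - 1) - lo_idx
    return ordered[lo_idx], ordered[hi_idx]

# Hypothetical commute times in minutes (illustrative only).
sample = [9.5, 12.0, 10.5, 14.0, 11.0, 13.5, 8.0, 12.5, 15.0, 10.0]

estimate = statistics.mean(sample)
reps = bootstrap_replicates(sample, statistics.mean, n_b=1000, seed=42)
se_boot = statistics.stdev(reps)                 # bootstrap SE = sd of replicates
ci_low, ci_high = percentile_interval(reps)      # 95% CI via percentiles
band = (estimate - 2 * se_boot, estimate + 2 * se_boot)  # rough +/- 2 SE band

print(f"mean = {estimate:.2f}, SE = {se_boot:.2f}")
print(f"percentile CI = ({ci_low:.2f}, {ci_high:.2f})")
print(f"+/- 2 SE band = ({band[0]:.2f}, {band[1]:.2f})")
```

Passing `statistics.median` or `statistics.stdev` instead of `statistics.mean` gives the same workflow for the other two statistics, which makes it easy to compare their SEs on one sample.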

Student commute case (minutes; n=10): Click “Load student one-way commute time example,” choose Mean and Nonparametric, then run. Next, switch to Median and compare the SE and CI. Ask: could survey timing or who responded bias the estimate?

SE ≠ “good data” — SE tells you precision (wobble). It does not fix sampling bias, measurement error, or a bad model.

When to use the bootstrap (small-n intuition): You have a small sample and want to know how your statistic (mean/median/SD) would vary if you could repeat the study many times. The bootstrap emulates those extra samples, either by resampling your data (nonparametric) or by drawing from a model such as Normal(μ,σ) (parametric), to estimate SE and CI. It does not add new information or fix sampling bias.

1 Statistic SD → we will resample your data and plot a histogram of bootstrap SDs (spread often right-skewed).

2 Replicates n_B

3 Seed (optional)

Data size n

Bootstrap Mode

Nonparametric (resample from your sample)
Parametric Normal $ \mathcal{N}(\mu,\sigma) $
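For the parametric mode, a minimal sketch might look like the following. It assumes the model is a Normal whose μ and σ are estimated from the sample (mean and sample SD), and it uses a seed so a run can be repeated exactly. The function name `parametric_bootstrap_normal` and the data are hypothetical.

```python
import random
import statistics

def parametric_bootstrap_normal(data, statistic, n_b=200, seed=None):
    """Draw n_b synthetic samples of size n from Normal(mu, sigma) fitted to data."""
    rng = random.Random(seed)
    n = len(data)
    mu = statistics.mean(data)       # assumption: mu estimated by the sample mean
    sigma = statistics.stdev(data)   # assumption: sigma estimated by the sample SD
    return [
        statistic([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(n_b)
    ]

# Hypothetical sample (illustrative only).
sample = [9.5, 12.0, 10.5, 14.0, 11.0, 13.5, 8.0, 12.5, 15.0, 10.0]

sd_reps = parametric_bootstrap_normal(sample, statistics.stdev, n_b=200, seed=7)
se_sd = statistics.stdev(sd_reps)    # bootstrap SE of the SD
print(f"SD = {statistics.stdev(sample):.2f}, parametric bootstrap SE = {se_sd:.2f}")
```

The only difference from the nonparametric mode is where each synthetic sample comes from: the fitted model rather than the observed values. If the Normal model is a poor fit, the resulting SE inherits that error.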

Excel — Clear Step-by-Step (one sheet)

Original data in A2:A11 (n = 10). Do not overwrite.
Set μ and σ (same sheet; only the parametric version uses them):
- B1: =AVERAGE($A$2:$A$11) ← μ
- B2: =STDEV.S($A$2:$A$11) ← σ
Nonparametric bootstrap (resample from A2:A11):
1. In B3: =INDEX($A$2:$A$11, RANDBETWEEN(1,10))
2. Fill B3 → B12 (Resample #1), then copy that 10×1 block across 200 columns (B…GS).
Parametric bootstrap (Normal with μ, σ), as an alternative in the same cells:
1. In B3: =NORM.INV(RAND(), $B$1, $B$2)
2. Fill B3 → B12 (Resample #1), then copy across 200 columns (B…GS).
Statistic for each resample (row 14):
- SD: in B14 → =STDEV.S(B3:B12), then fill across to GS14
- Mean: =AVERAGE(B3:B12) · Median: =MEDIAN(B3:B12)
Bootstrap SE (spread across resamples): any empty cell (e.g., B16) → =STDEV.S(B14:GS14)
One-shot spill (no dragging; needs a dynamic-array version of Excel that supports RANDARRAY): enter the formula in B3 and press Enter; Excel spills a 10×200 block into B3:GS12.
- Nonparametric (from A2:A11)
  =INDEX($A$2:$A$11, RANDARRAY(ROWS($A$2:$A$11), 200, 1, ROWS($A$2:$A$11), TRUE))
  Uses the sample size automatically via ROWS($A$2:$A$11). Change 200 to your desired n_B.
- Parametric (Normal with μ, σ in B1, B2)
  =NORM.INV(RANDARRAY(ROWS($A$2:$A$11), 200), $B$1, $B$2)
  Also spills 10×200 using the same n and n_B. Make sure B3:GS12 is empty before entering.
Press F9 to refresh; to freeze a run, copy B3:GS12 → Paste Special → Values.
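For readers who want to check the spreadsheet against code, the same layout can be mirrored in Python: one list per resample column, one statistic per column, and one SE across those statistics. This is only a sketch under the assumption of a hypothetical 10-value data column; the variable names are illustrative.

```python
import random
import statistics

rng = random.Random(2024)    # fixed seed plays the role of "Paste Special → Values"

# Hypothetical stand-in for the original data column (n = 10).
original = [9.5, 12.0, 10.5, 14.0, 11.0, 13.5, 8.0, 12.5, 15.0, 10.0]
n = len(original)
n_b = 200

# Each inner list is one resample column (rows 3 to 12 in the sheet).
# rng.randrange(n) is the 0-based counterpart of RANDBETWEEN(1, n).
columns = [[original[rng.randrange(n)] for _ in range(n)] for _ in range(n_b)]

# Row 14 in the sheet: one statistic per resample column (SD here).
row14 = [statistics.stdev(col) for col in columns]

# The bootstrap SE cell: spread of the row-14 statistics.
se_boot = statistics.stdev(row14)
print(f"bootstrap SE of the SD = {se_boot:.3f}")
```

Unlike pressing F9, rerunning this with the same seed reproduces the same resamples, so a result can be reported and checked later.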

Results

Original statistic
Bootstrap SE (SE_B): SE across the $ n_B $ bootstrap estimates
Bootstrap mean: mean of bootstrap estimates
Reference SE (formulas): Mean: s/√n · SD: s/√[2(n−1)] (Normal-only) · Median: N/A
“±2 SE” band: rough 95% band around original estimate
Bootstrap distribution
Teacher Summary (auto): Run the bootstrap to generate a plain-English summary here.
Reference check (SE formulas)
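The reference formulas, s/√n for the mean and s/√[2(n−1)] for the SD under a Normal model, are simple enough to compute directly and compare against a bootstrap SE. A minimal sketch, with hypothetical function names and made-up data:

```python
import math
import statistics

def reference_se_mean(data):
    """Formula SE of the mean: s / sqrt(n)."""
    return statistics.stdev(data) / math.sqrt(len(data))

def reference_se_sd_normal(data):
    """Approximate SE of the SD, valid only under a Normal model: s / sqrt(2(n-1))."""
    n = len(data)
    return statistics.stdev(data) / math.sqrt(2 * (n - 1))

# Hypothetical sample (illustrative only).
sample = [9.5, 12.0, 10.5, 14.0, 11.0, 13.5, 8.0, 12.5, 15.0, 10.0]

print(f"reference SE (mean) = {reference_se_mean(sample):.3f}")
print(f"reference SE (SD)   = {reference_se_sd_normal(sample):.3f}")
```

A large gap between a bootstrap SE and its reference value is a prompt to look at the histogram: skew, outliers, or a very small n_B are the usual suspects.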

What is the bootstrap? (plain English)

It’s a computer shortcut to measure how much your statistic (mean/median/SD) would wobble if you repeated the study many times.
Nonparametric: treat your sample like a tiny “population,” draw with replacement, recompute the statistic.
Parametric: if you assume a model (e.g., Normal with μ, σ), draw from that model and recompute.
The spread of those repeated statistics is the bootstrap standard error.

Why use it?

Small sample & can’t easily collect more: emulate many “new” samples to quantify precision (SE, CI).
Formulas are hard/unknown (e.g., SE of the SD with small n).
n is small and normal-approximation is shaky.
Quick, general SE or rough 95% band.

Common uses

SE/CI for mean, median, SD.
Uncertainty for regression coefficients.
Model performance uncertainty via resampling rows.
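As one illustration of the regression use case, the sketch below bootstraps the slope of a simple least-squares line by resampling (x, y) rows with replacement. Resampling rows is one common choice rather than the only one, and the helper names and data are hypothetical.

```python
import random
import statistics

def slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    return sxy / sxx

def bootstrap_slope_se(pairs, n_b=200, seed=None):
    """Resample rows with replacement; return (SE of slope, slope replicates)."""
    rng = random.Random(seed)
    n = len(pairs)
    reps = []
    while len(reps) < n_b:
        resample = rng.choices(pairs, k=n)
        if len({x for x, _ in resample}) < 2:
            continue  # slope is undefined when every resampled x is identical
        reps.append(slope(resample))
    return statistics.stdev(reps), reps

# Hypothetical (x, y) rows (illustrative only).
pairs = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8),
         (5, 10.1), (6, 12.2), (7, 13.8), (8, 16.1)]

se_slope, slope_reps = bootstrap_slope_se(pairs, n_b=200, seed=11)
print(f"slope = {slope(pairs):.3f}, bootstrap SE = {se_slope:.3f}")
```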

Compare to other methods (no math)

Normal / t-based formulas: Great when assumptions hold (e.g., s/√n for the mean). Bootstrap is more flexible when they do not.
Jackknife: Fast leave-one-out; good for smooth stats; unreliable for non-smooth ones such as the median, and can be rough for the SD at small n.
Permutation tests: For hypothesis tests (shuffle labels). Different goal.
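To make the jackknife comparison concrete, here is a minimal leave-one-out sketch. It uses the standard textbook jackknife SE formula, √[(n−1)/n · Σ(θᵢ − θ̄)²], which is an addition here and does not come from the tool. The function name and data are hypothetical.

```python
import math
import statistics

def jackknife_se(data, statistic):
    """Jackknife SE: recompute the statistic with each observation left out once."""
    n = len(data)
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    center = sum(leave_one_out) / n
    return math.sqrt((n - 1) / n * sum((t - center) ** 2 for t in leave_one_out))

# Hypothetical sample (illustrative only).
sample = [9.5, 12.0, 10.5, 14.0, 11.0, 13.5, 8.0, 12.5, 15.0, 10.0]

se_mean_jack = jackknife_se(sample, statistics.mean)
se_median_jack = jackknife_se(sample, statistics.median)
print(f"jackknife SE (mean)   = {se_mean_jack:.3f}")
print(f"jackknife SE (median) = {se_median_jack:.3f}")
```

For the mean, the jackknife SE reproduces s/√n exactly, which is a handy sanity check. For the median, the leave-one-out values take only a few distinct values, which is why the jackknife is not trusted there and the bootstrap is preferred.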