Session 7.3.4 Bootstrap Standard Error
Pick a statistic (mean, median, or SD), enter your data, then run a nonparametric or parametric bootstrap. The default number of bootstrap replicates is nB = 200.
Bootstrap — General Procedure (plain English)
- Define the question. What single number do you care about? (mean, median, or SD).
- Collect a sample. Small is OK (e.g., n=10). Enter it in the grid. Units matter (minutes, mm, etc.).
- Check bias risks. Is your sample representative? Any coverage/nonresponse/time-of-day bias? SE measures wobble (precision), not bias.
- Choose mode. Nonparametric (resample your data) or Parametric (draw from a model, e.g., Normal with μ,σ).
- Pick replicates. Start with nB = 200 for class; use 1000+ for reports.
- Resample & compute. For each bootstrap sample, compute the chosen statistic → a distribution of replicates.
- Summarize uncertainty. Bootstrap SE = SD of the replicates; 95% CI via percentiles (2.5th and 97.5th) or estimate \(\pm 2\times \text{SE}\).
- Interpret. Low SE = precise, not automatically “good data.” Inspect histogram shape; consider median if outliers/skew.
- Report. Example: “Mean commute = 11.7 min (SE 0.8; 95% CI 10.1–13.2; n=10; nonparametric bootstrap).”
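The procedure above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the page's own implementation. The function names, the fixed seed, the simple nearest-rank percentile rule, and the ten commute times are assumptions made up for the example (they are not the page's built-in dataset).

```python
import random
import statistics

def bootstrap_replicates(data, stat, n_boot=200, seed=0):
    """Nonparametric bootstrap: resample with replacement, recompute stat."""
    rng = random.Random(seed)  # fixed seed so a run can be reproduced
    n = len(data)
    return [stat(rng.choices(data, k=n)) for _ in range(n_boot)]

def percentile_interval(replicates, level=0.95):
    """Rough percentile interval read off the sorted replicates."""
    ordered = sorted(replicates)
    alpha = (1 - level) / 2
    lo_idx = int(alpha * (len(ordered) - 1))
    hi_idx = int(round((1 - alpha) * (len(ordered) - 1)))
    return ordered[lo_idx], ordered[hi_idx]

# Hypothetical commute times in minutes (made up for illustration, n = 10).
sample = [8.0, 9.5, 10.0, 10.5, 11.0, 12.0, 12.5, 13.0, 14.0, 16.5]

reps = bootstrap_replicates(sample, statistics.mean, n_boot=200)
estimate = statistics.mean(sample)
se_boot = statistics.stdev(reps)  # bootstrap SE = SD of the replicates
ci_low, ci_high = percentile_interval(reps)

print(f"mean = {estimate:.2f}, bootstrap SE = {se_boot:.2f}")
print(f"percentile 95% CI: {ci_low:.2f} to {ci_high:.2f}")
print(f"+/- 2 SE band:     {estimate - 2 * se_boot:.2f} to {estimate + 2 * se_boot:.2f}")
```

Passing `statistics.median` or `statistics.stdev` as `stat` gives the corresponding SE for the median or SD; raising `n_boot` to 1000 or more stabilizes the percentile endpoints.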
Student commute case (minutes; n=10): Click “Load student one way commute time example”, choose Mean and Nonparametric, then run. Then switch to Median and compare SE & CI. Ask: could survey timing or who responded bias the estimate?

SE ≠ “good data”: SE tells you precision (wobble). It does not fix sampling bias, measurement error, or a bad model.

  When to use the bootstrap (small-n intuition):
  You have a small sample and want to know how your statistic (mean/median/SD)
  would vary if you could repeat the study many times. The bootstrap emulates
  those extra samples — by resampling your data (nonparametric) or a model
  like Normal(μ,σ) (parametric) — to estimate SE and CI.
  It does not add new information or fix sampling bias.
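For the parametric mode, a minimal sketch might look like the following. It assumes a Normal model with μ and σ estimated from the sample (mean and sample SD), which is one possible choice rather than the only one. The function name, seed, and data values are hypothetical.

```python
import random
import statistics

def parametric_bootstrap_se(data, stat, n_boot=200, seed=1):
    """Draw n_boot samples of size n from Normal(mu, sigma) fitted to data."""
    rng = random.Random(seed)
    mu = statistics.mean(data)      # plug-in estimate of mu
    sigma = statistics.stdev(data)  # sample SD (n - 1 denominator)
    n = len(data)
    reps = [stat([rng.gauss(mu, sigma) for _ in range(n)])
            for _ in range(n_boot)]
    return statistics.stdev(reps)   # bootstrap SE = SD of the replicates

# Hypothetical commute times in minutes (made up for illustration, n = 10).
sample = [8.0, 9.5, 10.0, 10.5, 11.0, 12.0, 12.5, 13.0, 14.0, 16.5]

results = {}
for name, stat in [("mean", statistics.mean),
                   ("median", statistics.median),
                   ("SD", statistics.stdev)]:
    results[name] = parametric_bootstrap_se(sample, stat)
    print(f"{name}: parametric bootstrap SE = {results[name]:.2f}")
```

Because every draw comes from the fitted model, the result is only as trustworthy as the Normal assumption; with skewed or outlier-prone data the nonparametric mode is the safer default.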
  
        
        
        SD → we will resample your data and plot a histogram of bootstrap SDs (spread often right-skewed).
      
      
Excel — Clear Step-by-Step (one sheet)
- Original data in A2:A11 (n = 10). Do not overwrite.
- Set μ and σ (same sheet):
  - B1: =AVERAGE($A$2:$A$11) ← μ
  - B2: =STDEV.S($A$2:$A$11) ← σ
- Nonparametric bootstrap (resample from A2:A11):
  - In B3: =INDEX($A$2:$A$11, RANDBETWEEN(1,10))
  - Fill B3 → B12 (Resample #1), then copy that 10×1 block across 200 columns (B…GS).
- Parametric bootstrap (Normal with μ, σ), used instead of the nonparametric formula:
  - In B3: =NORM.INV(RAND(), $B$1, $B$2)
  - Fill B3 → B12 (Resample #1), then copy across 200 columns (B…GS).
- Statistic for each resample (row 14):
  - SD: in B14 → =STDEV.S(B3:B12), then fill across to GS14
  - Mean: =AVERAGE(B3:B12) · Median: =MEDIAN(B3:B12)
- Bootstrap SE (spread across resamples): in any cell outside the resample block (e.g., B16) → =STDEV.S(B14:GS14)
- One-shot spill (no dragging; needs a version of Excel with dynamic arrays and RANDARRAY): make sure B3:GS12 is empty, enter one of the formulas below in B3, and press Enter. Excel spills a 10×200 block into B3:GS12.
  - Nonparametric (from A2:A11):
    =INDEX($A$2:$A$11, RANDARRAY(ROWS($A$2:$A$11), 200, 1, ROWS($A$2:$A$11), TRUE))
    Uses the sample size automatically via ROWS($A$2:$A$11). Change 200 to your desired nB.
  - Parametric (Normal with μ, σ in B1, B2):
    =NORM.INV(RANDARRAY(ROWS($A$2:$A$11), 200), $B$1, $B$2)
    Also spills 10×200 using the same n and nB.
- Press F9 to refresh; to freeze a run, copy B3:GS12 → Paste Special → Values.
Results
- Original statistic
- Bootstrap SE (SE_B): SD across the \( n_B \) bootstrap estimates
- Bootstrap mean: mean of the bootstrap estimates
- Reference SE (formulas): Mean: s/√n · SD: s/√[2(n−1)] (Normal-only) · Median: N/A
- “±2 SE” band: rough 95% band around the original estimate
- Bootstrap distribution
- Teacher Summary (auto): run the bootstrap to generate a plain-English summary
- Reference check (SE formulas)
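The reference SE formulas named in the results (s/√n for the mean, s/√[2(n−1)] for the SD under a Normal model) can be computed directly and set beside a bootstrap SE as a sanity check. The sketch below is illustrative only; the function name and the data values are hypothetical.

```python
import math
import statistics

def reference_se(data):
    """Textbook SE formulas; the SD formula assumes Normal data."""
    n = len(data)
    s = statistics.stdev(data)  # sample SD with n - 1 denominator
    return {
        "mean": s / math.sqrt(n),
        "sd": s / math.sqrt(2 * (n - 1)),  # Normal-only approximation
        "median": None,                    # no simple formula listed here
    }

# Hypothetical commute times in minutes (made up for illustration, n = 10).
sample = [8.0, 9.5, 10.0, 10.5, 11.0, 12.0, 12.5, 13.0, 14.0, 16.5]

ref = reference_se(sample)
print(f"reference SE of mean: {ref['mean']:.3f}")
print(f"reference SE of SD:   {ref['sd']:.3f}")
```

A bootstrap SE for the mean that lands far from s/√n usually points to a data-entry problem or too few replicates, not to a failure of the formula.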
      What is the bootstrap? (plain English)
- It’s a computer shortcut to measure how much your statistic (mean/median/SD) would wobble if you repeated the study many times.
- Nonparametric: treat your sample like a tiny “population,” draw with replacement, recompute the statistic.
- Parametric: if you assume a model (e.g., Normal with μ, σ), draw from that model and recompute.
- The spread of those repeated statistics is the bootstrap standard error.
Why use it?
- Small sample & can’t easily collect more: emulate many “new” samples to quantify precision (SE, CI).
- Formulas are hard/unknown (e.g., SE of the SD with small n).
- n is small and the normal approximation is shaky.
- Quick, general SE or rough 95% band.
Common uses
- SE/CI for mean, median, SD.
- Uncertainty for regression coefficients.
- Model performance uncertainty via resampling rows.
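As an illustration of the regression use case, the sketch below bootstraps the SE of a least-squares slope by resampling whole (x, y) rows. It is one common variant (the "pairs" bootstrap), shown under assumptions: the data are made up, the function names are hypothetical, and resamples in which every x is identical are redrawn because the slope is undefined there.

```python
import random
import statistics

def slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return sxy / sxx

def bootstrap_slope_se(pairs, n_boot=1000, seed=2):
    """Resample rows with replacement and return the SD of the slopes."""
    rng = random.Random(seed)
    n = len(pairs)
    reps = []
    while len(reps) < n_boot:
        resample = rng.choices(pairs, k=n)
        if len({x for x, _ in resample}) < 2:
            continue  # all x equal: slope undefined, draw again
        reps.append(slope(resample))
    return statistics.stdev(reps)

# Hypothetical (distance in km, commute time in minutes) rows, made up.
rows = [(1.0, 6.0), (2.0, 8.5), (3.0, 9.0), (4.0, 12.5),
        (5.0, 13.0), (6.0, 16.5), (7.0, 17.0), (8.0, 20.5)]

fitted = slope(rows)
se_slope = bootstrap_slope_se(rows)
print(f"slope = {fitted:.3f} min/km, bootstrap SE = {se_slope:.3f}")
```

Resampling rows keeps each x paired with its own y, so the method does not rely on equal-variance residuals.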
Compare to other methods (no math)
- Normal / t-based formulas: Great when assumptions hold (mean’s s/√n). Bootstrap is more flexible when not.
- Jackknife: Fast leave-one-out; good for smooth stats such as the mean; its SE can be unreliable for non-smooth stats such as the median, especially with small n.
- Permutation tests: For hypothesis tests (shuffle labels). Different goal.
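To make the jackknife comparison concrete, here is a minimal sketch of the standard leave-one-out SE formula, sqrt((n−1)/n · Σ(θᵢ − θ̄)²). The function name and data are hypothetical and the example is illustrative only.

```python
import math
import statistics

def jackknife_se(data, stat):
    """Leave-one-out jackknife SE of stat."""
    n = len(data)
    loo = [stat(data[:i] + data[i + 1:]) for i in range(n)]  # n estimates
    loo_mean = statistics.mean(loo)
    return math.sqrt((n - 1) / n * sum((t - loo_mean) ** 2 for t in loo))

# Hypothetical commute times in minutes (made up for illustration, n = 10).
sample = [8.0, 9.5, 10.0, 10.5, 11.0, 12.0, 12.5, 13.0, 14.0, 16.5]

jk_mean = jackknife_se(sample, statistics.mean)
jk_median = jackknife_se(sample, statistics.median)
loo_medians = {statistics.median(sample[:i] + sample[i + 1:])
               for i in range(len(sample))}

print(f"jackknife SE of mean:   {jk_mean:.3f}")
print(f"jackknife SE of median: {jk_median:.3f}")
print(f"distinct leave-one-out medians: {sorted(loo_medians)}")
```

For the mean, the jackknife SE reproduces s/√n exactly. For the median, the leave-one-out estimates take only two distinct values in this sample, which shows why the jackknife is a poor fit for non-smooth statistics and why the bootstrap is usually preferred there.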