📊 Chapter 7: Sampling Distributions and Estimation

🎮 Interactive Apps (HTML/JS)

📽️ PPT Slides

Download Chapter 7 Slides

📝 Quiz

📚 Homework — Chapter 7 (Excel-Ready)

Complete the parts in order. Use the exact datasets and formulas listed here; they mirror the apps. Submit one PDF with tables, histograms, CI/PI endpoints, and short interpretations.

7.2 CLT (Normal source) — Use CLT_Pop_Normal.csv (N=5,000). 
• For n = 5 and n = 30, draw 1,000 samples each and record the sample mean 𝑿̄ₙ of every sample. 
• Compare the empirical SD(𝑿̄ₙ) to the theory σ/√n. 
• Include two histograms and a sentence on shape/center/spread.
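
The 7.2 simulation can also be sketched outside Excel as a cross-check. This is a minimal sketch under stated assumptions, not the course's official solution: the population is a synthetic stand-in for CLT_Pop_Normal.csv (the mean 50 and SD 10 are made up), samples are drawn with replacement, and the helper name `simulate_sample_means` is hypothetical.

```python
import math
import random
import statistics

def simulate_sample_means(population, n, reps=1000, seed=0):
    """Draw `reps` samples of size n (with replacement) and return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(reps)]

# Assumption: synthetic stand-in for CLT_Pop_Normal.csv (N = 5,000).
rng = random.Random(42)
population = [rng.gauss(50, 10) for _ in range(5000)]
sigma = statistics.pstdev(population)  # population SD, the analogue of STDEV.P

for n in (5, 30):
    means = simulate_sample_means(population, n)
    empirical = statistics.stdev(means)   # SD of the 1,000 sample means
    theory = sigma / math.sqrt(n)         # sigma / sqrt(n)
    print(f"n={n}: empirical SD={empirical:.3f}, theory={theory:.3f}")
```

With 1,000 reps the two numbers should agree to within a few percent; the gap shrinks as the number of reps grows.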

7.3.4 Bootstrap (100 resamples) — Use Bootstrap_One_Normal.csv (n=30). 
• Create 100 bootstrap means, report Bootstrap SE. 
• Give a 95% percentile CI and a 95% normal-approx CI (mean ± 1.96×Bootstrap SE). 
• Include a histogram and one interpretation sentence.
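
The bootstrap steps above can be sketched in Python as follows. It assumes a synthetic 30-value sample in place of Bootstrap_One_Normal.csv, centers the normal-approx CI on the original sample mean (the assignment says only "mean"), and uses linear-interpolation percentiles intended to match Excel's PERCENTILE.INC. The function names are hypothetical.

```python
import random
import statistics

def bootstrap_means(data, reps=100, seed=0):
    """Resample `data` with replacement `reps` times; return the resample means."""
    rng = random.Random(seed)
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(reps)]

def percentile(sorted_vals, p):
    """Percentile by linear interpolation between order statistics."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

# Assumption: synthetic stand-in for Bootstrap_One_Normal.csv (n = 30).
rng = random.Random(7)
sample = [rng.gauss(100, 15) for _ in range(30)]

boot = sorted(bootstrap_means(sample))
boot_se = statistics.stdev(boot)        # Bootstrap SE = SD of the bootstrap means
center = statistics.fmean(sample)       # original sample mean
pct_ci = (percentile(boot, 0.025), percentile(boot, 0.975))
norm_ci = (center - 1.96 * boot_se, center + 1.96 * boot_se)

print(f"Bootstrap SE: {boot_se:.3f}")
print(f"95% percentile CI:    ({pct_ci[0]:.2f}, {pct_ci[1]:.2f})")
print(f"95% normal-approx CI: ({norm_ci[0]:.2f}, {norm_ci[1]:.2f})")
```

With only 100 resamples the percentile endpoints move noticeably from run to run, so the two intervals will not match exactly.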

7.4.2 MLE via Solver (OLS) — Use OLS_Practice_Normal.csv (31 pairs). 
• Fit ŷ = β₀ + β₁x by minimizing SSE with Solver. 
• Report β̂₀, β̂₁, SSE, and one line on why OLS = MLE with Normal errors.
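
One way to sanity-check a Solver result is the closed-form least-squares solution, which should give the same β̂₀, β̂₁, and SSE up to Solver's tolerance. The sketch below uses made-up data rather than OLS_Practice_Normal.csv, and `ols_fit` is a hypothetical helper name.

```python
def ols_fit(x, y):
    """Closed-form simple linear regression: returns (b0, b1, sse)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # slope
    b0 = ybar - b1 * xbar          # intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return b0, b1, sse

# Assumption: illustrative data only, not the homework file.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1, sse = ols_fit(x, y)
print(f"b0={b0:.4f}, b1={b1:.4f}, SSE={sse:.4f}")
```

If Solver reports a noticeably larger SSE than this, it probably stopped early or started from poor initial values.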
  

Build steps & exact Excel formulas: chapter7-homework.html

Quick downloads:

⬇️ CLT_Pop_Normal.csv ⬇️ Bootstrap_One_Normal.csv ⬇️ OLS_Practice_Normal.csv

Don’t mix these up: CI half-width = t*·s/√n (for the parameter μ); PI half-width = t*·s·√(1+1/n) (for the next observation); TI half-width = k·s (to cover a chosen % of the population).

🧮 Excel Quick Formulas (copy-ready)

Mean: =AVERAGE(B2:Bn)

Sample SD (use for s): =STDEV.S(B2:Bn) (legacy Excel: =STDEV() behaves like STDEV.S)

Population SD (treat whole file as population, e.g., CLT_Pop_Normal): =STDEV.P(B2:Bn) (this is σ)


t critical (two-tailed): =T.INV.2T(alpha, N-1) (alpha = 1 − confidence level, e.g., 0.05 for 95%)

CI (mean): half = t* * S / SQRT(N) → CI = XBAR ± half

PI (next value): half = t* * S * SQRT(1 + 1/N) → PI = XBAR ± half

TI (coverage γ): TI = XBAR ± k*S (k from a TI table/software for your γ, confidence, and n)
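
The CI and PI formulas above translate directly to Python. In this sketch the data are made up, `ci_and_pi` is a hypothetical helper, and t* is passed in by hand (2.776 is the two-tailed 95% value for 4 degrees of freedom, which is what =T.INV.2T(0.05, 4) returns to three decimals). The TI is left out because k has to come from a table or software.

```python
import math
import statistics

def ci_and_pi(data, t_star):
    """Return ((ci_low, ci_high), (pi_low, pi_high)) for the mean and next value."""
    n = len(data)
    xbar = statistics.fmean(data)
    s = statistics.stdev(data)                       # sample SD, like STDEV.S
    ci_half = t_star * s / math.sqrt(n)              # half-width for the mean
    pi_half = t_star * s * math.sqrt(1 + 1 / n)      # half-width for next value
    return (xbar - ci_half, xbar + ci_half), (xbar - pi_half, xbar + pi_half)

# Assumption: illustrative data, n = 5, so df = 4 and t* = 2.776 at 95%.
data = [9.8, 10.2, 10.0, 10.4, 9.6]
ci, pi = ci_and_pi(data, t_star=2.776)
print(f"95% CI for the mean: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% PI for next obs: ({pi[0]:.3f}, {pi[1]:.3f})")
```

The PI is always wider than the CI by a factor of √(n+1), because it has to cover the scatter of a single new value as well as the uncertainty in the mean.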

📢 Student Q&A — Quick, plain-English answers (20)

Q1. What does the “mean” tell me in real life?
It’s the typical level. If quiz scores are 70, 80, 90, the mean is 80—roughly what you expect for a new student. In engineering, mean battery life (\(\bar x\)) is the average runtime you’d plan around.
Q2. Why does the average get less noisy with bigger samples?
Because \(\mathrm{SD}(\bar X)=\sigma/\sqrt{n}\). Doubling \(n\) divides the noise by \(\sqrt{2}\), so cutting it in half takes four times the data.
Q3. What does the CLT actually do for me?
Even if the original data aren’t Normal, the mean \(\bar X\) is approximately Normal for moderate/large \(n\). That lets you use t/z tools for \(\bar X\).
Q4. When do I use a t-interval instead of z?
Use t when population \(\sigma\) is unknown and you estimate it with \(s\).
Q5. How do I word a 95% CI correctly?
“We are 95% confident the true mean lies between [lower, upper].” Don’t say “95% probability the mean is in the interval.”
Q6. What is a bootstrap sample?
Resampling your data with replacement. Each resample has size \(n\) drawn from your original \(n\).
Q7. How many bootstrap resamples do I need?
For class demos: 100. For stable CI ends: 1,000+. Papers often use 5,000.
Q8. Percentile CI vs Normal-approx CI?
Percentile uses the empirical 2.5th/97.5th percentiles of bootstrap estimates. Normal-approx uses mean ± 1.96×bootstrap SE.
Q9. What does slope \(\beta_1\) mean?
Expected change in \(y\) per +1 in \(x\).
Q10. Intercept \(\beta_0\)?
Predicted \(y\) when \(x=0\); may not be a realistic point.
Q11. SSE, MSE, RMSE?
SSE = \(\sum (y-\hat y)^2\). MSE = SSE/(n−k), where k is the number of estimated coefficients (k = 2 for a line with intercept and slope). RMSE = \(\sqrt{\text{MSE}}\).
Q12. Why OLS = MLE with Normal errors?
Maximizing Normal likelihood ⇔ minimizing SSE.
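A short derivation makes the equivalence concrete. It is a sketch assuming the usual model \(y_i=\beta_0+\beta_1 x_i+\varepsilon_i\) with independent errors \(\varepsilon_i\sim N(0,\sigma^2)\); the log-likelihood symbol \(\ell\) is introduced here only for illustration.

```latex
\[
\ell(\beta_0,\beta_1,\sigma^2)
  = -\frac{n}{2}\ln\!\left(2\pi\sigma^2\right)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i-\beta_0-\beta_1 x_i\right)^2
  = -\frac{n}{2}\ln\!\left(2\pi\sigma^2\right) - \frac{\text{SSE}}{2\sigma^2}
\]
```

For any fixed \(\sigma^2\), the first term does not involve \(\beta_0\) or \(\beta_1\), and the second term is a negative multiple of SSE, so the \(\beta\) values that maximize \(\ell\) are exactly the ones that minimize SSE.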
Q13. If errors aren’t Normal—does OLS break?
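No. The fitted line is still the least-squares line, but t-based intervals and p-values can be unreliable, especially for small \(n\). Consider robust or bootstrapped alternatives.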
Q14. Why is a t-interval wider than z-interval?
Estimating \(\sigma\) with \(s\) adds uncertainty, especially for small \(n\).
Q15. Welch’s t-interval?
Two-sample mean difference when variances aren’t assumed equal.
Q16. Label-preserving bootstrap?
Resample within each group separately, keeping labels.
Q17. Outliers and line fit?
A single extreme \(x\) or \(y\) can swing the slope. Check residuals.
Q18. Is high \(R^2\) always good?
No—doesn’t prove causation or safe extrapolation.
Q19. AIC/BIC?
Measures of fit with a penalty for the number of parameters. Lower is better, and only models fit to the same data should be compared.
Q20. Sample size for target MOE?
Known \(\sigma\): \(n=(z^*\sigma/E)^2\), rounded up to the next whole number, where \(E\) is the target margin of error.
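
As a worked example of the sample-size formula, the sketch below computes \(n\) for a known \(\sigma\) and a target margin of error. The function name `sample_size_for_moe` and the input values (σ = 10, E = 2) are made up for illustration; z* comes from Python's standard-library Normal distribution.

```python
import math
from statistics import NormalDist

def sample_size_for_moe(sigma, margin, confidence=0.95):
    """Smallest n with z* * sigma / sqrt(n) <= margin (sigma assumed known)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return math.ceil((z_star * sigma / margin) ** 2)         # always round up

# (1.96 * 10 / 2)^2 is about 96.04, which rounds up to 97.
print(sample_size_for_moe(sigma=10, margin=2))  # → 97
```

Rounding up rather than to the nearest integer guarantees that the margin of error is no larger than the target.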