📢 Student Q&A — Quick, plain-English answers (20)
  Q1. What does the “mean” tell me in real life?
    It’s the arithmetic average, a one-number summary of the typical level. If quiz scores are 70, 80, 90, the mean is (70+80+90)/3 = 80, roughly what you’d expect from a new student. In engineering, mean battery life (\(\bar x\)) is the average runtime you’d plan around.
  Q2. Why does the average get less noisy with bigger samples?
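    Because \(\mathrm{SD}(\bar X)=\sigma/\sqrt{n}\) for independent observations. Doubling \(n\) divides the noise by \(\sqrt{2}\approx1.41\), so halving the noise takes four times the data. Think of averaging many noisy sensor readings: the average is steadier than any single reading.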
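    A small simulation can make the \(\sigma/\sqrt{n}\) rule concrete. The sketch below assumes Normal readings with \(\sigma=10\); the function name `sd_of_sample_mean`, the seed, and the number of trials are illustrative choices, not part of the original answer.

```python
import random
import statistics

def sd_of_sample_mean(n, sigma=10.0, trials=4000, seed=0):
    """Empirical SD of the mean of n Normal(0, sigma) readings."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.stdev(means)

sd_25 = sd_of_sample_mean(25)    # theory: 10 / sqrt(25) = 2.0
sd_100 = sd_of_sample_mean(100)  # theory: 10 / sqrt(100) = 1.0
print(f"n=25: {sd_25:.2f}   n=100: {sd_100:.2f}")
```

    Going from 25 to 100 readings (four times the data) should roughly halve the spread of the average.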
  Q3. What does the CLT actually do for me?
    Even if the original data aren’t Normal, the mean \(\bar X\) of independent observations is approximately Normal for moderate/large \(n\); the more skewed the data, the larger \(n\) needs to be. That lets you use t/z tools for \(\bar X\). Example: commute times are skewed, but class-average commute time is close to Normal.
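    The commute-time example can be checked with a quick simulation. This is only a sketch: it assumes individual commute times follow an exponential distribution with a mean of 30 minutes and a class size of 40, both invented for illustration. Skewness near 0 indicates a roughly symmetric, Normal-like shape.

```python
import random
import statistics

def skewness(values):
    """Average cubed z-score: about 0 for symmetric data, positive for a long right tail."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

rng = random.Random(1)

# Individual commute times: exponential with mean 30 minutes (strongly right-skewed).
commutes = [rng.expovariate(1 / 30) for _ in range(20000)]

# Class-average commute time for 5,000 simulated classes of 40 students each.
class_means = [
    statistics.fmean([rng.expovariate(1 / 30) for _ in range(40)])
    for _ in range(5000)
]

raw_skew = skewness(commutes)       # theory for an exponential: 2
mean_skew = skewness(class_means)   # theory: 2 / sqrt(40), about 0.32
print(f"raw skewness: {raw_skew:.2f}   skewness of class means: {mean_skew:.2f}")
```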
  Q4. When do I use a t-interval instead of z?
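    Use t when the population \(\sigma\) is unknown (the typical case) and you estimate it with the sample SD \(s\); use z only when \(\sigma\) is actually known. Excel: T.INV.2T(0.05, n-1) returns the critical value \(t^*\) for a 95% interval.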
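    As a worked illustration, here is a 95% t-interval computed by hand in Python. The runtimes are hypothetical, and the critical value 2.262 is the standard table value for 9 degrees of freedom (what T.INV.2T(0.05, 9) returns), hard-coded so the sketch needs only the standard library.

```python
import math
import statistics

# Hypothetical battery runtimes in hours (n = 10).
runtimes = [9.8, 10.4, 10.1, 9.6, 10.9, 10.2, 9.9, 10.5, 10.0, 10.6]

n = len(runtimes)
x_bar = statistics.fmean(runtimes)
s = statistics.stdev(runtimes)   # sample SD, divides by n - 1
se = s / math.sqrt(n)            # estimated standard error of the mean

T_STAR = 2.262  # 95% two-sided critical value for df = n - 1 = 9
Z_STAR = 1.960  # what you would use if sigma were known

t_interval = (x_bar - T_STAR * se, x_bar + T_STAR * se)
z_interval = (x_bar - Z_STAR * se, x_bar + Z_STAR * se)

print(f"t-interval: ({t_interval[0]:.2f}, {t_interval[1]:.2f})")
print(f"z-interval: ({z_interval[0]:.2f}, {z_interval[1]:.2f})")
```

    The t-interval comes out wider than the z-interval built from the same standard error, because the t critical value is larger.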
  Q5. How do I word a 95% CI correctly?
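    “We are 95% confident the true mean lies between [lower, upper].” Don’t say “95% probability the mean is in the interval”: the true mean is a fixed number, and the 95% describes the method, which captures the true mean in about 95% of repeated samples.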
  Q6. What is a bootstrap sample in plain English?
    It’s re-sampling your data with replacement, like drawing tickets from a hat and putting them back. Each resample has size \(n\), taken from your original \(n\).
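    In code, one bootstrap resample is a single call. A minimal sketch with made-up data, using `random.choices`, which draws with replacement:

```python
import random

rng = random.Random(42)                     # fixed seed so the draw is repeatable
data = [12, 15, 9, 20, 14, 11]              # hypothetical original sample, n = 6
resample = rng.choices(data, k=len(data))   # draw n tickets, putting each one back

print(resample)  # same size as data; some values may repeat, others may be missing
```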
  Q7. How many bootstrap resamples do I need?
    For class demos: 100 (fast). For more stable CI ends: 1,000+. Many use 5,000 when publishing.
  Q8. Percentile CI vs Normal-approx CI for bootstrap—what’s the difference?
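    Percentile CI uses the 2.5th and 97.5th percentiles of the bootstrap estimates. Normal-approx uses estimate ± 1.96×(bootstrap SE), where the bootstrap SE is the standard deviation of the bootstrap estimates. The Normal-approx interval is always symmetric, so if the bootstrap distribution is skewed, the percentile CI usually reflects the data better.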
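    Both intervals can be computed from the same set of bootstrap estimates. The sketch below uses hypothetical, right-skewed wait times and 2,000 resamples; the function names are illustrative. `statistics.quantiles(..., n=40)` returns cut points every 2.5%, so its first and last entries are the 2.5th and 97.5th percentiles.

```python
import random
import statistics

def bootstrap_means(data, n_boot=2000, seed=0):
    """Mean of each of n_boot resamples drawn with replacement from data."""
    rng = random.Random(seed)
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_boot)]

def percentile_ci_95(estimates):
    """2.5th and 97.5th percentiles of the bootstrap estimates."""
    cuts = statistics.quantiles(estimates, n=40)
    return cuts[0], cuts[-1]

def normal_ci_95(estimate, estimates):
    """Estimate +/- 1.96 x bootstrap SE (the SD of the bootstrap estimates)."""
    se = statistics.stdev(estimates)
    return estimate - 1.96 * se, estimate + 1.96 * se

# Hypothetical wait times in minutes, with one long wait.
waits = [2, 3, 3, 4, 4, 5, 5, 6, 7, 9, 12, 30]
sample_mean = statistics.fmean(waits)   # 7.5
boot = bootstrap_means(waits)

pct_lo, pct_hi = percentile_ci_95(boot)
norm_lo, norm_hi = normal_ci_95(sample_mean, boot)
print(f"percentile:    ({pct_lo:.2f}, {pct_hi:.2f})")
print(f"normal-approx: ({norm_lo:.2f}, {norm_hi:.2f})")
```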
  Q9. In a line \(y=\beta_0+\beta_1 x\), what does \(\beta_1\) mean?
    Slope = expected change in \(y\) per 1 unit of \(x\). If \(x\)=study hours and \(y\)=score, \(\hat\beta_1=4.8\) means about +4.8 points per extra hour.
  Q10. What does the intercept \(\beta_0\) mean?
    It’s the predicted \(y\) when \(x=0\). Sometimes that point isn’t realistic (e.g., 0 hours of sleep), so interpret with care.
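    To see slope and intercept come out of actual numbers, here is a minimal least-squares fit using the textbook closed-form formulas. The study-hours data and the helper name `fit_line` are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

hours = [1, 2, 3, 4, 5]         # hypothetical study hours
scores = [55, 61, 64, 70, 75]   # hypothetical quiz scores

b0, b1 = fit_line(hours, scores)
print(f"intercept = {b0:.1f}, slope = {b1:.1f}")  # intercept = 50.3, slope = 4.9
```

    Here the slope says about +4.9 points per extra hour, and the intercept (50.3) is the predicted score at 0 hours, a value outside the observed range of 1 to 5 hours.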
  Q11. What are SSE, MSE, and RMSE?
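    SSE = \(\sum (y_i-\hat y_i)^2\) (total squared error). MSE = SSE/(n−k), where \(k\) is the number of fitted coefficients (\(k=2\) for a line with intercept and slope); it is the average squared error, adjusted for the parameters you estimated. RMSE = \(\sqrt{\text{MSE}}\) (typical error in the original units of \(y\)). Excel: SSE=SUMSQ(residuals).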
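    A short calculation shows the three quantities side by side. The data are hypothetical, the coefficients 50.3 and 4.9 are the least-squares fit for exactly these five points, and \(k=2\) is assumed because the line has an intercept and a slope.

```python
import math

hours = [1, 2, 3, 4, 5]         # hypothetical study hours
scores = [55, 61, 64, 70, 75]   # hypothetical quiz scores
b0, b1 = 50.3, 4.9              # least-squares intercept and slope for this data
k = 2                           # fitted coefficients: intercept + slope

residuals = [y - (b0 + b1 * x) for x, y in zip(hours, scores)]
sse = sum(r ** 2 for r in residuals)   # same as Excel's SUMSQ(residuals)
mse = sse / (len(scores) - k)
rmse = math.sqrt(mse)

print(f"SSE = {sse:.2f}, MSE = {mse:.3f}, RMSE = {rmse:.3f}")
# SSE = 1.90, MSE = 0.633, RMSE = 0.796
```

    An RMSE of about 0.8 means the line typically misses a score by a bit under one point.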
  Q12. Why is OLS also MLE when errors are Normal?
    Normal log-likelihood is \(\ell=-\frac{n}{2}\ln(2\pi\sigma^2)-\frac{1}{2\sigma^2}\sum (y-\beta_0-\beta_1x)^2\). For fixed \(\sigma^2\), maximizing \(\ell\) ⇔ minimizing SSE, so OLS = MLE.
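    Spelled out, using the same symbols as the log-likelihood above: only the last term of \(\ell\) involves \(\beta_0,\beta_1\), and it enters with a negative sign, so

    \[
    (\hat\beta_0,\hat\beta_1)=\arg\max_{\beta_0,\beta_1}\ \ell
    =\arg\min_{\beta_0,\beta_1}\ \sum (y-\beta_0-\beta_1x)^2 .
    \]

    Setting the derivative with respect to \(\sigma^2\) to zero then gives the variance estimate:

    \[
    \frac{\partial \ell}{\partial \sigma^2}
    =-\frac{n}{2\sigma^2}+\frac{\text{SSE}}{2\sigma^4}=0
    \quad\Longrightarrow\quad
    \hat\sigma^2_{\text{MLE}}=\frac{\text{SSE}}{n}.
    \]

    Note that the MLE divides by \(n\), whereas the MSE in Q11 divides by \(n-k\); the coefficients agree, but the two variance estimates differ slightly.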
  Q13. What if errors aren’t Normal—does OLS break?
    OLS still gives the least-squares line but may be sensitive to outliers/heavy tails. Alternatives: LAD (L1) regression, robust regression, or bootstrap CIs for more reliable inference.
  Q14. Why is a t-interval wider than a z-interval?
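    Because we don’t know \(\sigma\) and estimate it with \(s\), adding extra uncertainty (especially for small \(n\)). That shows up as a larger critical value: at 95% with \(n=10\), \(t^*\approx2.262\) versus \(z^*\approx1.96\). As \(n\) grows, \(t^*\) shrinks toward \(z^*\).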
  Q15. What’s Welch’s t-interval and when use it?
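    It’s a CI for the difference between two group means that does not assume the groups have equal variances. Use it for A/B tests where the groups have different spreads. It’s the safe default, because it costs little when the variances happen to be equal.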
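    The ingredients of Welch’s interval are the difference in means, the standard error \(\sqrt{s_A^2/n_A+s_B^2/n_B}\), and the Welch–Satterthwaite degrees of freedom. A sketch with hypothetical A/B data follows; the helper name `welch_ingredients` is illustrative. The critical value is left as a lookup, e.g. T.INV.2T(0.05, df) in Excel, and the interval is then difference ± \(t^*\)×SE.

```python
import math
import statistics

def welch_ingredients(a, b):
    """Return (difference in means, standard error, Welch df) for samples a and b."""
    va = statistics.variance(a) / len(a)   # s_A^2 / n_A
    vb = statistics.variance(b) / len(b)   # s_B^2 / n_B
    diff = statistics.fmean(a) - statistics.fmean(b)
    se = math.sqrt(va + vb)
    # Welch-Satterthwaite approximation; usually not a whole number.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return diff, se, df

# Hypothetical measurements: group B is clearly more spread out than group A.
group_a = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3]
group_b = [13.0, 11.2, 14.1, 12.6, 10.9, 13.8, 12.2, 14.4]

diff, se, df = welch_ingredients(group_a, group_b)
print(f"difference = {diff:.3f}, SE = {se:.3f}, df = {df:.1f}")
```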
  Q16. What does “label-preserving” bootstrap mean for two samples?
    You resample within each group A and B separately (keep labels). This keeps group sizes and within-group structure the same in each resample.
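    A minimal sketch of that idea for the difference in means, with hypothetical data: each group is resampled on its own, so every resample keeps 6 values labelled A and 8 labelled B. The function name and the choice of 2,000 resamples are illustrative.

```python
import random
import statistics

def bootstrap_diff_ci_95(group_a, group_b, n_boot=2000, seed=0):
    """Percentile 95% CI for mean(A) - mean(B), resampling within each group."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        res_a = rng.choices(group_a, k=len(group_a))   # only A values, size n_A
        res_b = rng.choices(group_b, k=len(group_b))   # only B values, size n_B
        diffs.append(statistics.fmean(res_a) - statistics.fmean(res_b))
    cuts = statistics.quantiles(diffs, n=40)   # cut points every 2.5%
    return cuts[0], cuts[-1]

group_a = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3]
group_b = [13.0, 11.2, 14.1, 12.6, 10.9, 13.8, 12.2, 14.4]

observed = statistics.fmean(group_a) - statistics.fmean(group_b)
ci_lo, ci_hi = bootstrap_diff_ci_95(group_a, group_b)
print(f"observed difference = {observed:.3f}, 95% CI = ({ci_lo:.3f}, {ci_hi:.3f})")
```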
  Q17. How do outliers affect my line fit?
    A single extreme \(x\) or \(y\) can swing the slope a lot. Check residual plots; consider transformations, winsorizing, or robust methods.
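    A tiny experiment shows how much one point can matter. The data are made up: five well-behaved points, then the same five plus one extreme observation at \(x=10\).

```python
def slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

hours = [1, 2, 3, 4, 5]
scores = [55, 61, 64, 70, 75]

slope_clean = slope(hours, scores)                        # about 4.9
slope_with_outlier = slope(hours + [10], scores + [40])   # about -1.9

print(f"without outlier: {slope_clean:.2f}   with outlier: {slope_with_outlier:.2f}")
```

    One added point is enough to flip the sign of the slope here, which is why a residual plot is worth checking before trusting the fit.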
  Q18. Is high \(R^2\) always good?
    High \(R^2\) means the line explains more variation, but it doesn’t prove causation or guarantee good predictions outside your \(x\)-range (extrapolation risk!).
  Q19. What are AIC and BIC in one line each?
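    Both score fit with a penalty for model size; lower is better. AIC \(=2k-2\ell(\hat\theta)\) charges 2 per parameter. BIC \(=k\ln n-2\ell(\hat\theta)\) charges \(\ln n\) per parameter, so it favors smaller models than AIC once \(n\ge 8\). Here \(k\) is the number of estimated parameters, \(n\) the sample size, and \(\ell(\hat\theta)\) the maximized log-likelihood.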
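    The two formulas translate directly into code. In this sketch the log-likelihoods, parameter counts, and sample size are invented numbers, used only to show how a comparison between a small and a large model would go.

```python
import math

def aic(loglik, k):
    """AIC = 2k - 2 * (maximized log-likelihood)."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """BIC = k * ln(n) - 2 * (maximized log-likelihood)."""
    return k * math.log(n) - 2 * loglik

n = 50
# Hypothetical fits: (maximized log-likelihood, number of parameters).
small_model = (-120.0, 2)
big_model = (-118.5, 5)

for name, (ll, k) in [("small", small_model), ("big", big_model)]:
    print(f"{name}: AIC = {aic(ll, k):.1f}, BIC = {bic(ll, k, n):.1f}")
```

    With these numbers the bigger model fits slightly better but not by enough to pay for three extra parameters, so both AIC (244 vs 247) and BIC prefer the small model.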
  Q20. How many samples do I need for a target margin of error?
    For a mean with known \(\sigma\): \(n=\big(z^*\sigma/E\big)^2\), rounded up to the next whole number. Example: want MOE \(E=2\) minutes for bus wait times, \(\sigma\approx 10\), 95% ⇒ \(z^*\approx1.96\): \(n=(1.96\cdot10/2)^2\approx96.04\), so sample at least 97 waits.
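    The formula is easy to wrap in a helper. This sketch assumes \(\sigma\) is known; it takes \(z^*\) from the standard library’s `NormalDist` rather than hard-coding 1.96 and rounds up, since rounding down would leave the margin slightly too wide. The function name is illustrative.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n with z* * sigma / sqrt(n) <= margin (sigma assumed known)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z_star * sigma / margin) ** 2)

n_needed = sample_size_for_mean(sigma=10, margin=2)   # bus wait-time example
print(n_needed)  # 97
```

    Halving the target margin to 1 minute roughly quadruples the requirement: `sample_size_for_mean(sigma=10, margin=1)` gives 385.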