📘 Session 7.1 — Point Estimation (SAT Example)
🎯 What are we actually trying to estimate?
Target: the true average SAT for next year’s freshmen, call it \( \mu_{\text{next}} \). We don’t know it yet because those students aren’t here.
So what do we do right now?
- Use the data we do have (recent cohorts, early admits) to build a point estimate — a single best guess.
- Make that guess honest by adding an uncertainty band (a 90–95% confidence interval).
Key idea: The sample average \( \bar{x} \) is our point estimate for the population mean \( \mu \).
Its typical error is the standard error \( \text{SE}(\bar{x}) \approx s/\sqrt{n} \).
Bigger \( n \Rightarrow \) smaller error.
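To see how quickly the error shrinks, here is a minimal Python sketch. The helper name `standard_error` and the value s = 95 are illustrative assumptions, not real admissions data.

```python
import math

def standard_error(s: float, n: int) -> float:
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)

s = 95.0  # made-up sample standard deviation, for illustration only
for n in (10, 40, 160, 640):
    print(f"n = {n:>3}  SE = {standard_error(s, n):.2f}")
```

Because of the square root, quadrupling \( n \) only halves the standard error, so precision gets expensive as samples grow.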
A small SAT example (all numbers are made up)
We pull a quick sample of \( n = 40 \) reported SAT scores from admitted students. The sample stats are:
- Sample mean: \( \bar{x} = 1120 \)
- Sample st. dev.: \( s = 95 \)
- SE: \( s/\sqrt{n} = 95/\sqrt{40} \approx 15.0 \)
95% CI (back-of-the-envelope):
\( \bar{x} \pm 1.96\cdot \text{SE} = 1120 \pm 1.96\cdot 15 \approx [1091,\;1149] \).
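The same arithmetic can be checked in Python. This is a sketch that uses only the standard library and the made-up sample statistics above; the variable names are illustrative.

```python
import math
from statistics import NormalDist

x_bar, s, n = 1120.0, 95.0, 40   # made-up sample statistics from the example

se = s / math.sqrt(n)
z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96
margin = z * se
ci_low, ci_high = x_bar - margin, x_bar + margin

print(f"SE = {se:.2f}, margin = {margin:.1f}")
print(f"95% CI = [{ci_low:.0f}, {ci_high:.0f}]")  # rounds to [1091, 1149]
```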
What goes in the recruiting doc?
- Stationary (no big changes) version:
“Projected average SAT for incoming freshmen: ~1120 (95% CI 1091–1149), based on a recent sample.”
- Multi-year, steadier version:
“Three-year average SAT: 1118 (95% CI 1112–1124), weighted by class size.”
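"Weighted by class size" means each cohort's mean counts in proportion to its number of students. A minimal sketch follows, assuming hypothetical cohort means and class sizes; they are not the figures behind the sentence above.

```python
# Each tuple is (cohort mean SAT, class size). All values are hypothetical.
cohorts = [(1105, 480), (1118, 510), (1130, 495)]

total_students = sum(size for _, size in cohorts)
weighted_mean = sum(mean * size for mean, size in cohorts) / total_students

# An unweighted average treats every cohort as equally large.
simple_mean = sum(mean for mean, _ in cohorts) / len(cohorts)

print(f"Weighted mean: {weighted_mean:.1f}")
print(f"Unweighted mean: {simple_mean:.1f}")
```

The two averages differ most when class sizes vary a lot from year to year.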
Bias vs. variance in one sentence: A larger sample lowers random error (variance) but does not fix a skewed sample (bias).
If test-optional policies mean many students don’t report SAT scores, say so explicitly and, if possible, adjust or stratify.
Super-short checklist
- Say the target (\( \mu_{\text{next}} \) or multi-year \( \mu \)).
- Compute \( \bar{x} \), \( s \), \( n \). (Excel: =AVERAGE, =STDEV.S, =COUNT)
- Uncertainty: \( \text{SE} \approx s/\sqrt{n} \); for the CI, use the critical value =T.INV.2T(0.05, n-1) if \( n \) is small; otherwise \( z\approx 1.96 \) is fine.
- Write one sentence + CI + a note on who’s included (test-optional, admits vs. enrolled).
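For readers not working in Excel, the checklist can be sketched in Python. The scores are hypothetical and `mean_with_ci` is an illustrative helper name. Because the sample has only 8 scores, the call passes 2.365, the t critical value for 7 degrees of freedom (what =T.INV.2T(0.05, 7) returns to three decimals).

```python
import math
from statistics import mean, stdev

def mean_with_ci(scores, crit=1.96):
    """Return (x_bar, s, n, (low, high)) for a list of scores.

    crit is the critical value: about 1.96 for a large-sample 95% CI,
    or a t value with n - 1 degrees of freedom when n is small.
    """
    n = len(scores)                 # =COUNT
    if n < 2:
        raise ValueError("need at least two scores to estimate s")
    x_bar = mean(scores)            # =AVERAGE
    s = stdev(scores)               # =STDEV.S (n - 1 denominator)
    se = s / math.sqrt(n)
    return x_bar, s, n, (x_bar - crit * se, x_bar + crit * se)

# Hypothetical scores, for illustration only.
scores = [1010, 1080, 1150, 1200, 1090, 1130, 1240, 1060]
x_bar, s, n, (low, high) = mean_with_ci(scores, crit=2.365)
print(f"Estimated mean SAT: {x_bar:.0f} (95% CI {low:.0f}-{high:.0f}), n = {n}")
```

The printed sentence still needs the last checklist item added by hand: a note on who is included in the sample.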
🔍 Point Estimation — without the jargon
Point estimation uses a single number (a statistic) calculated from a sample to estimate an unknown population parameter.
Formally, if you take a random sample \( X_1, X_2, \ldots, X_n \) from a population with unknown mean \( \mu \), a point estimator is any statistic
\( \widehat{\Theta} = h(X_1, X_2, \ldots, X_n) \).
The most common estimator of the mean is the sample average:
\[
\bar{x} \;=\; \frac{1}{n}\sum_{i=1}^n X_i
\]
Once you compute it on your data, the number you get is the point estimate \( \hat{\mu} = \bar{x} \).
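The difference between an estimator (a rule) and an estimate (a number) can be shown with a short simulation. This is a sketch under stated assumptions: the population is simulated from a normal distribution with made-up parameters, and `sample_mean` is an illustrative name for \( h \).

```python
import random
from statistics import mean

def sample_mean(sample):
    """The estimator h(X_1, ..., X_n): a rule that maps any sample to a number."""
    return mean(sample)

random.seed(0)  # fixed seed so the run is repeatable

# Simulated population of scores; the parameters are made up for illustration.
population = [random.gauss(1120, 95) for _ in range(10_000)]
mu = mean(population)

# Same estimator, three different random samples of n = 40: three estimates.
estimates = [sample_mean(random.sample(population, 40)) for _ in range(3)]

print(f"Population mean: {mu:.1f}")
for i, estimate in enumerate(estimates, start=1):
    print(f"Sample {i} estimate: {estimate:.1f}")
```

Each sample gives a slightly different estimate of the same \( \mu \), which is why an estimate should be reported with its standard error.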
📐 Population vs. Sample — Key Equations
Population Mean
\[
\mu \;=\; \frac{\sum_{i=1}^{N} X_i}{N}
\]
Sample Mean
\[
\bar{x} \;=\; \frac{\sum_{i=1}^{n} x_i}{n}
\]
Population Variance
\[
\sigma^2 \;=\; \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}
\]
Sample Variance
\[
s^2 \;=\; \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{\,n - 1\,}
\]
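The only difference between the two variance formulas is the denominator. The sketch below implements both directly and checks them against Python's `statistics.pvariance` and `statistics.variance`; the data and function names are hypothetical.

```python
from statistics import pvariance, variance

def population_variance(values):
    """Divide by N: use when the values are the entire population."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Divide by n - 1: use when the values are a sample from a larger population."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

# Hypothetical scores, for illustration only.
data = [1010, 1080, 1150, 1200, 1090, 1130, 1240, 1060]

print(population_variance(data), pvariance(data))  # both use the N denominator
print(sample_variance(data), variance(data))       # both use the n - 1 denominator
```

The sample variance is always the larger of the two on the same data, and the gap shrinks as \( n \) grows.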
For the SAT example, our estimator for \( \mu \) is \( \bar{x} \).
A useful uncertainty measure is the standard error:
\( \text{SE}(\bar{x}) \approx s/\sqrt{n} \).