📊 Session 7.1 — Point Estimation (SAT Example)

🎯 What are we actually trying to estimate?

Target: the true average SAT for next year’s freshmen, call it \( \mu_{\text{next}} \). We don’t know it yet because those students aren’t here.

So what do we do right now?

Use the data we do have (recent cohorts, early admits) to build a point estimate — a single best guess.
Make that guess honest by adding an uncertainty band (a 90–95% confidence interval).

Key idea: The sample average \( \bar{x} \) is our point estimate for the population mean \( \mu \). Its typical error is the standard error \( \text{SE}(\bar{x}) \approx s/\sqrt{n} \). Because \( n \) sits under a square root, a bigger sample shrinks the error slowly: quadrupling \( n \) only halves the SE.

A small SAT example (all numbers made up)

We pull a quick sample of \( n = 40 \) reported SAT scores from admitted students. The sample stats are:

Sample mean: \( \bar{x} = 1120 \)
Sample st. dev.: \( s = 95 \)
SE: \( s/\sqrt{n} = 95/\sqrt{40} \approx 15.0 \)

95% CI (back-of-the-envelope): \( \bar{x} \pm 1.96\cdot \text{SE} = 1120 \pm 1.96\cdot 15 \approx [1091,\;1149] \).
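The arithmetic above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `mean_ci` and its parameters are illustrative, not part of the session material.

```python
import math
from statistics import NormalDist

def mean_ci(xbar, s, n, level=0.95):
    """Normal-approximation confidence interval for a mean from summary stats."""
    se = s / math.sqrt(n)                      # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level = 0.95
    return se, xbar - z * se, xbar + z * se

# Summary statistics from the made-up SAT sample above
se, lo, hi = mean_ci(xbar=1120, s=95, n=40)
print(f"SE = {se:.1f}, 95% CI = [{lo:.0f}, {hi:.0f}]")
# prints: SE = 15.0, 95% CI = [1091, 1149]
```

Changing `level` to 0.90 gives the narrower 90% band with the same code.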

What goes in the recruiting doc?

Stationary (no big changes) version: “Projected average SAT for incoming freshmen: ~1120 (95% CI 1091–1149), based on a recent sample.”
Multi-year, steadier version: “Three-year average SAT: 1118 (95% CI 1112–1124), weighted by class size.”
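"Weighted by class size" means each year's mean counts in proportion to how many students it represents. A minimal sketch follows; the three cohort means and sizes are hypothetical values picked only so the result lands near the 1118 quoted above.

```python
# Hypothetical cohorts: (label, mean SAT, class size). All values are made up.
cohorts = [
    ("year 1", 1110, 900),
    ("year 2", 1120, 1000),
    ("year 3", 1123, 1100),
]

total_students = sum(size for _, _, size in cohorts)

# Each yearly mean is weighted by its class size
weighted_mean = sum(mean * size for _, mean, size in cohorts) / total_students

# For comparison: treating every year equally, regardless of size
simple_mean = sum(mean for _, mean, _ in cohorts) / len(cohorts)

print(f"weighted by class size: {weighted_mean:.1f}")
print(f"unweighted average of yearly means: {simple_mean:.1f}")
```

The two answers differ only slightly here because the class sizes are similar; the gap grows when one cohort is much larger than the others.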

Bias vs. variance in brief: A larger sample lowers random error (variance) but does not fix a skewed sample (bias). If test-optional admissions mean many students don’t report an SAT score, say so explicitly and, if possible, adjust or stratify.
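A small simulation can make the distinction concrete. The sketch below is illustrative only: the population (mean 1100, SD 100) and the reporting rule (only scores of 1050 or higher are reported) are assumptions invented for the demo, not a model of any real applicant pool.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of SAT-like scores (made up: mean 1100, SD 100)
population = [random.gauss(1100, 100) for _ in range(20_000)]
true_mean = statistics.fmean(population)

# Assumed reporting rule, purely for illustration: only scores >= 1050 are reported
reporters = [score for score in population if score >= 1050]

results = {}
for n in (40, 400, 4000):
    random_mean = statistics.fmean(random.sample(population, n))
    reported_mean = statistics.fmean(random.sample(reporters, n))
    # Store each estimate's error relative to the true population mean
    results[n] = (random_mean - true_mean, reported_mean - true_mean)
    print(f"n={n:>4}: random-sample error {results[n][0]:+6.1f}, "
          f"reporters-only error {results[n][1]:+6.1f}")
```

As \( n \) grows, the random-sample error should drift toward zero, while the reporters-only error settles near a fixed positive offset: more data makes the biased estimate more precise, not more correct.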

Super-short checklist

Say the target (\( \mu_{\text{next}} \) or multi-year \( \mu \)).
Compute \( \bar{x} \), \( s \), \( n \). (Excel: =AVERAGE, =STDEV.S, =COUNT)
Uncertainty: \( \text{SE} \approx s/\sqrt{n} \); 95% CI is \( \bar{x} \pm t\cdot\text{SE} \), with \( t \) from =T.INV.2T(0.05,n-1) if \( n \) is small (a common rule of thumb is under about 30); otherwise \( z\approx 1.96 \) is fine.
Write one sentence + CI + a note on who’s included (test-optional, admits vs. enrolled).
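The checklist can be run end to end on raw scores. The sketch below assumes a hypothetical list of ten reported scores (made up for illustration) and mirrors the Excel functions named above. Because the Python standard library has no t-distribution quantile, the critical value for 9 degrees of freedom is hard-coded from a standard t-table; it is the number =T.INV.2T(0.05, 9) returns.

```python
import math
import statistics

# Hypothetical reported scores (made up for illustration only)
scores = [1010, 1080, 1150, 1200, 1090, 1130, 1240, 1060, 1170, 1110]

n = len(scores)                 # Excel: =COUNT
xbar = statistics.mean(scores)  # Excel: =AVERAGE
s = statistics.stdev(scores)    # Excel: =STDEV.S (n - 1 in the denominator)
se = s / math.sqrt(n)           # standard error of the mean

# n = 10 is small, so use the t critical value instead of z = 1.96.
# Two-sided 95%, 9 degrees of freedom, from a standard t-table.
T_CRIT_9DF = 2.262

lo = xbar - T_CRIT_9DF * se
hi = xbar + T_CRIT_9DF * se

print(f"Projected average SAT: ~{xbar:.0f} (95% CI {lo:.0f}–{hi:.0f}), "
      f"based on n = {n} reported scores")
```

With only ten scores the t interval is noticeably wider than the z interval would be, which is the reason for the small-\( n \) branch in the checklist.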

🔍 Point Estimation — without the jargon

Point estimation uses a single number (a statistic) calculated from a sample to estimate an unknown population parameter.

Formally, if you take a random sample \( X_1, X_2, \ldots, X_n \) from a population with unknown mean \( \mu \), a point estimator is any statistic \( \widehat{\Theta} = h(X_1, X_2, \ldots, X_n) \). The most common estimator of the mean is the sample average:

\[ \bar{X} \;=\; \frac{1}{n}\sum_{i=1}^n X_i \]

Once you compute it on your observed data \( x_1, \ldots, x_n \), the number you get is the point estimate \( \hat{\mu} = \bar{x} \).
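One way to keep estimator and estimate apart is to think of the estimator as a function and the estimate as its return value. A minimal sketch, with two hypothetical samples of made-up scores:

```python
def sample_mean(sample):
    """The estimator h(X_1, ..., X_n): a rule that maps any sample to a number."""
    return sum(sample) / len(sample)

# Two hypothetical samples (made-up scores) drawn from the same population
sample_a = [1100, 1180, 1050, 1150]
sample_b = [1090, 1210, 1130, 1070]

estimate_a = sample_mean(sample_a)  # one point estimate
estimate_b = sample_mean(sample_b)  # a different estimate from the same estimator
print(estimate_a, estimate_b)
# prints: 1120.0 1125.0
```

The rule stays the same; the number it produces changes from sample to sample, and that variation is what the standard error measures.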

📐 Population vs. Sample — Key Equations

Population Mean

\[ \mu \;=\; \frac{\sum_{i=1}^{N} X_i}{N} \]

Sample Mean

\[ \bar{x} \;=\; \frac{\sum_{i=1}^{n} x_i}{n} \]

Population Variance

\[ \sigma^2 \;=\; \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \]

Sample Variance

\[ s^2 \;=\; \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{\,n - 1\,} \]
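The only difference between the two variance formulas is the denominator, \( N \) versus \( n - 1 \). Python's standard `statistics` module exposes both, so the contrast is easy to see on a tiny data set (the four scores below are made up):

```python
import statistics

# Made-up scores, small enough to check by hand
data = [1000, 1100, 1200, 1300]

mean = statistics.mean(data)                        # 1150
sum_sq_dev = sum((x - mean) ** 2 for x in data)     # 50000

pop_var = statistics.pvariance(data)   # divides by N: treats data as the whole population
samp_var = statistics.variance(data)   # divides by n - 1: treats data as a sample

print(pop_var, sum_sq_dev / len(data))         # both 12500
print(samp_var, sum_sq_dev / (len(data) - 1))  # both about 16666.67
```

The sample version is always the larger of the two, and the gap shrinks as \( n \) grows.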

For the SAT example, we estimate \( \mu \) with \( \bar{x} \). A useful measure of its uncertainty is the standard error: \( \text{SE}(\bar{x}) \approx s/\sqrt{n} \).
