Session 7.4.2 — MLE used in OLS

Goal: Show that the usual straight-line fit from OLS is also the maximum likelihood line when errors are roughly Normal. Simple steps, one table, Excel-ready.

1) The idea in plain English

Two different “Y minus …”: Y−Ȳ is used to build the slope (before we know the line). Y−Ŷ is the residual after we have the line, used to judge fit (SSE).

2) Enter data: Study hours (X) → Exam score (Y)

#X = Study (hrs)Y = Score (0–100)

3) Results & meanings

β₁ (slope)
β₀ (intercept)
Closer to 1 ⇒ study hours explain more score variation.
SSE
OLS picks β₀, β₁ to make this as small as possible.

4) The table (build the line → then judge the line)

X̄ = —, Ȳ = —
# X Y X−X̄
use to build slope
Y−Ȳ
use to build slope
(X−X̄)(Y−Ȳ)
sum → numerator
(X−X̄)²
sum → denominator
Ŷ = β₀ + β₁X
line prediction
Residual = Y−Ŷ
judge fit
Residual²
sum → SSE
Totals used to compute slope/intercept SSE = —
How we get β₁ and β₀ (no calculus shown):
β̂₁ = Σ(X−X̄)(Y−Ȳ) / Σ(X−X̄)²
β̂₀ = Ȳ − β̂₁·X̄
We’ll plug your totals here after you press “Compute”.
Why this is MLE: If errors around the line are roughly Normal with the same spread, the line that maximizes likelihood is exactly the one that minimizes SSE. That’s why OLS estimates (β̂₀, β̂₁) are also MLEs.

5) Excel: use Solver to “find β” by minimizing SSE (Sum of Squared Errors)

  1. Put X in A2:A?, Y in B2:B?.
  2. Pick two cells for the unknowns, say F2 = β₀, F3 = β₁. Put any starting guesses (e.g., 50 and 5).
  3. In C2 type: =$F$2 + $F$3*A2 → fill down (this is Ŷ).
  4. In D2 type: =B2 - C2 → fill down (residual).
  5. In a cell for SSE (e.g., F5): =SUMSQ(D2:D?).
  6. Data → Solver: Set Objective = F5, To = Min, By Changing Cells = F2:F3, Method = GRG NonlinearSolve.

Solver will return the same β̂₀, β̂₁ you got above, because minimizing SSE = maximizing likelihood under Normal errors.

6) Visual — Scatter, OLS line, and residual segments

After you press Compute, the scatter updates, the blue line is Ŷ = β̂₀ + β̂₁X, and each orange segment is a residual (Y−Ŷ). OLS chooses β̂ to make these orange segments, squared and summed, as small as possible.