Session 7.4.2 — MLE in OLS
Goal: Show that the usual straight-line fit from OLS is also the maximum likelihood line when errors are roughly Normal. Simple steps, one table, Excel-ready.
1) The idea in plain English
- We want a rule: Score = β₀ + β₁·Study + noise.
- OLS picks the line that makes the total of the squared mistakes as small as possible. That total is the SSE (Sum of Squared Errors), also called the Residual Sum of Squares (RSS).
- MLE view: if the “noise” around the line is roughly Normal with the same spread everywhere, then the line that makes your data most likely is exactly the OLS line. So OLS = MLE (a quick numeric check follows this list).

Note the two different “Y minus …”: Y−Ȳ is used to build the slope (before we know the line); Y−Ŷ is the residual after we have the line, used to judge fit (it is what SSE adds up).
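Optional numeric check, outside the Excel workflow: a minimal Python sketch (numpy/scipy assumed; the X, Y values below are made up for illustration, not the lesson’s data). It fits the line twice — once with the OLS formulas from Section 4, once by maximizing the Normal log-likelihood — and the two answers agree.

```python
# Minimal sketch: OLS closed form vs. maximum likelihood under Normal errors.
# Assumptions: numpy and scipy installed; X, Y are made-up illustration data.
import numpy as np
from scipy.optimize import minimize

X = np.array([1.0, 2.0, 3.0, 5.0, 6.0])      # study hours (hypothetical)
Y = np.array([52.0, 58.0, 65.0, 78.0, 83.0])  # exam scores (hypothetical)

# OLS the "totals" way: slope = numerator / denominator, intercept from means.
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()

# MLE way: maximize the Normal log-likelihood = minimize its negative.
def neg_log_lik(params):
    beta0, beta1, log_sigma = params
    sigma = np.exp(log_sigma)            # optimize log(sigma) to keep sigma > 0
    resid = Y - (beta0 + beta1 * X)
    n = len(Y)
    return 0.5 * n * np.log(2 * np.pi * sigma**2) + np.sum(resid**2) / (2 * sigma**2)

fit = minimize(neg_log_lik, x0=[50.0, 5.0, 0.0])
print("OLS:", b0, b1)
print("MLE:", fit.x[0], fit.x[1])  # same line, up to solver tolerance
```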
2) Enter data: Study hours (X) → Exam score (Y)
| # | X = Study (hrs) | Y = Score (0–100) |
|---|---|---|
3) Results & meanings

| Quantity | Value | What it means |
|---|---|---|
| β̂₁ (slope) | — | Expected change in score for one extra hour of study. |
| β̂₀ (intercept) | — | Predicted score at zero study hours. |
| R² | — | Closer to 1 ⇒ study hours explain more of the score variation. |
| SSE | — | OLS picks β₀, β₁ to make this as small as possible. |
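For reference, R² comes from the same table totals via a standard identity: R² = 1 − SSE/SST, where SST = Σ(Y−Ȳ)² is the total variation of the scores around their mean.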
4) The table (build the line → then judge the line)

X̄ = —, Ȳ = —

| # | X | Y | X−X̄ (builds slope) | Y−Ȳ (builds slope) | (X−X̄)(Y−Ȳ) (sum → numerator) | (X−X̄)² (sum → denominator) | Ŷ = β̂₀ + β̂₁X (prediction) | Y−Ŷ (residual, judges fit) | (Y−Ŷ)² (sum → SSE) |
|---|---|---|---|---|---|---|---|---|---|
| Totals | | | | | Σ = — | Σ = — | | | SSE = — |
How we get β̂₁ and β̂₀ (no calculus shown):

β̂₁ = Σ(X−X̄)(Y−Ȳ) / Σ(X−X̄)²
β̂₀ = Ȳ − β̂₁·X̄

We’ll plug your totals in here after you press “Compute”.
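A quick worked example with made-up totals (not your data): if Σ(X−X̄)(Y−Ȳ) = 40 and Σ(X−X̄)² = 10, then β̂₁ = 40/10 = 4; and with X̄ = 3, Ȳ = 65, β̂₀ = 65 − 4·3 = 53, so the fitted line is Ŷ = 53 + 4X.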
Why this is MLE: if the errors around the line are roughly Normal with the same spread, the line that maximizes the likelihood is exactly the line that minimizes SSE. That’s why the OLS estimates (β̂₀, β̂₁) are also MLEs.
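The one-line reason, written out: with errors ~ Normal(0, σ²), the log-likelihood of the n data points is

ln L(β₀, β₁, σ²) = −(n/2)·ln(2πσ²) − SSE(β₀, β₁) / (2σ²)

β₀ and β₁ appear only inside SSE, with a minus sign in front, so for any fixed σ² the pair (β₀, β₁) that maximizes ln L is exactly the pair that minimizes SSE.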
      
5) Excel: use Solver to “find β” by minimizing SSE
- Put X in A2:A?, Y in B2:B?.
- Pick two cells for the unknowns, say F2 = β₀, F3 = β₁. Put any starting guesses (e.g., 50 and 5).
- In C2 type: =$F$2 + $F$3*A2 → fill down (this is Ŷ).
- In D2 type: =B2 - C2 → fill down (residual).
- In a cell for SSE (e.g., F5): =SUMSQ(D2:D?).
- Data → Solver: Set Objective = F5, To = Min, By Changing Cells = F2:F3, Method = GRG Nonlinear → Solve.
Solver will return the same β̂₀, β̂₁ you got above, because minimizing SSE = maximizing likelihood under Normal errors.
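As a cross-check (no Solver needed), Excel’s built-in regression functions should return the same answers: =SLOPE(B2:B?,A2:A?) gives β̂₁, =INTERCEPT(B2:B?,A2:A?) gives β̂₀, and =RSQ(B2:B?,A2:A?) gives R².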
6) Visual — Scatter, OLS line, and residual segments
After you press Compute, the scatter updates: the blue line is Ŷ = β̂₀ + β̂₁X, and each orange segment is a residual (Y−Ŷ). OLS chooses β̂ to make these orange segments, squared and summed, as small as possible.