📘 Section 12.5 – Model Adequacy Checking

🔎 12.5.1 Residual Analysis

Residual: eᵢ = yᵢ − ŷᵢ

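Standardized residual: dᵢ = eᵢ / √σ̂², where σ̂² = MSE

Studentized residual: rᵢ = eᵢ / √(σ̂² × (1 − hᵢᵢ)), where hᵢᵢ is the leverage of observation i

Example (σ̂² = 5.2352, e₁₅ = 5.84, e₁₇ = 4.33, leverages 0.0737 and 0.2593):
d₁₅ = 5.84 / √5.2352 = 2.55
d₁₇ = 4.33 / √5.2352 = 1.89
r₁₅ = 5.84 / √(5.2352 × (1 − 0.0737)) = 2.65
r₁₇ = 4.33 / √(5.2352 × (1 − 0.2593)) = 2.20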
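
The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative choices, and the inputs are the residuals, MSE, and leverages quoted in the example.

```python
import math

def standardized_residual(e, mse):
    """d_i = e_i / sqrt(MSE)."""
    return e / math.sqrt(mse)

def studentized_residual(e, mse, h):
    """r_i = e_i / sqrt(MSE * (1 - h_ii))."""
    return e / math.sqrt(mse * (1.0 - h))

MSE = 5.2352  # residual mean square quoted in the example

d15 = standardized_residual(5.84, MSE)
d17 = standardized_residual(4.33, MSE)
r15 = studentized_residual(5.84, MSE, 0.0737)
r17 = studentized_residual(4.33, MSE, 0.2593)

for name, value in [("d15", d15), ("d17", d17), ("r15", r15), ("r17", r17)]:
    print(f"{name} = {value:.2f}")
```

Because 1 − hᵢᵢ is less than 1, the studentized residual is always at least as large in magnitude as the standardized one, and the gap widens as leverage grows.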

🎯 Hat Matrix and Leverage

Hat Matrix: H = X(X′X)⁻¹X′

Leverage: hᵢᵢ = x′ᵢ(X′X)⁻¹xᵢ
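
As a rough illustration of these two formulas, the sketch below builds the hat matrix with NumPy and reads the leverages off its diagonal. The design matrix X is made-up data (an intercept column plus two hypothetical predictors), not the data behind the worked example.

```python
import numpy as np

# Hypothetical design matrix: intercept column plus two made-up predictors.
X = np.array([
    [1.0,  2.0,  50.0],
    [1.0,  8.0, 110.0],
    [1.0, 11.0, 120.0],
    [1.0, 10.0, 550.0],
    [1.0,  8.0, 295.0],
    [1.0,  4.0, 200.0],
])

XtX_inv = np.linalg.inv(X.T @ X)

# Hat matrix H = X (X'X)^-1 X', an n x n matrix.
H = X @ XtX_inv @ X.T

# Leverages are the diagonal entries h_ii.
leverage = np.diag(H)

# Same values computed row by row: h_ii = x_i' (X'X)^-1 x_i.
leverage_rowwise = np.einsum("ij,jk,ik->i", X, XtX_inv, X)

print(np.round(leverage, 4))
print("sum of leverages:", round(float(leverage.sum()), 6))
```

The leverages sum to the number of columns of X (the trace of H), which is a convenient sanity check. For large n, the row-wise form avoids storing the full n × n matrix.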

🔥 12.5.2 Influential Observations

Cook’s Distance identifies influential points in regression analysis:

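Dᵢ = r²ᵢ × [hᵢᵢ / (p × (1 − hᵢᵢ))]

Here rᵢ is the studentized residual, hᵢᵢ is the leverage, and p is the number of model parameters (including the intercept).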

If Dᵢ > 1 → point is potentially influential

Observation   hᵢᵢ      Dᵢ
1             0.1573   0.035
15            0.0737   0.187
17            0.2593   0.565
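
The tabulated distances for observations 15 and 17 can be reproduced from the studentized residuals given earlier (2.65 and 2.20). The sketch below assumes p = 3, which is not stated in this part of the text but is the value that matches the table; the function name is illustrative.

```python
def cooks_distance(r, h, p):
    """D_i = r_i^2 * h_ii / (p * (1 - h_ii))."""
    return (r ** 2) * h / (p * (1.0 - h))

P = 3  # assumption: number of parameters that reproduces the tabulated values

D15 = cooks_distance(2.65, 0.0737, P)
D17 = cooks_distance(2.20, 0.2593, P)

for label, D in [("15", D15), ("17", D17)]:
    flag = "potentially influential" if D > 1 else "below the D > 1 threshold"
    print(f"observation {label}: D = {D:.3f} ({flag})")
```

The results agree with the table to within rounding of the inputs. Observation 17 has the smaller residual of the two but the larger distance, because its leverage is more than three times that of observation 15.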

📘 Summary

🎓 Student GPA Example

We model GPA as a function of three predictors, x₁, x₂, and x₃.

The multiple regression model is:

GPA = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + ε

Suppose that after fitting we obtain the following fitted values, observed values, residuals, and leverages:

Student   ŷᵢ    yᵢ    eᵢ     hᵢᵢ
5         3.0   2.3   -0.7   0.12
12        3.4   4.0    0.6   0.08
17        2.5   4.1    1.6   0.28

Let MSE = 0.45, p = 4 (3 predictors + intercept)

🔎 Residual Analysis

d₁₇ = 1.6 / √0.45 = 2.39
r₁₇ = 1.6 / √(0.45 × (1 − 0.28)) = 1.6 / √0.324 = 2.81

🎯 Hat Matrix and Leverage

Leverage: hᵢᵢ = x′ᵢ(X′X)⁻¹xᵢ

🔥 Cook’s Distance for GPA Data

Formula:

Dᵢ = r²ᵢ × [hᵢᵢ / (p × (1 − hᵢᵢ))]
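D₁₇ = r²₁₇ × [0.28 / (4 × (1 − 0.28))], with r²₁₇ = 1.6² / (0.45 × 0.72) = 7.90
D₁₇ = 7.90 × 0.0972 = 0.768

Since D₁₇ < 1, student 17 does not cross the usual influence threshold, but with the largest residual and the largest leverage in the table it should be monitored.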
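
The same diagnostics can be computed for all three students at once. This is a minimal sketch that takes the table values, MSE = 0.45, and p = 4 as given; the dictionary layout and variable names are illustrative.

```python
import math

MSE = 0.45  # residual mean square from the example
P = 4       # 3 predictors + intercept

# student id: (observed GPA, fitted GPA, leverage), taken from the table
students = {
    5:  (2.3, 3.0, 0.12),
    12: (4.0, 3.4, 0.08),
    17: (4.1, 2.5, 0.28),
}

results = {}
for sid, (y, y_hat, h) in students.items():
    e = y - y_hat                          # residual
    d = e / math.sqrt(MSE)                 # standardized residual
    r = e / math.sqrt(MSE * (1.0 - h))     # studentized residual
    D = r ** 2 * h / (P * (1.0 - h))       # Cook's distance
    results[sid] = {"e": e, "d": d, "r": r, "D": D}
    print(f"student {sid}: e={e:+.2f}  d={d:+.2f}  r={r:+.2f}  D={D:.3f}")

most_influential = max(results, key=lambda sid: results[sid]["D"])
print("largest Cook's distance: student", most_influential)
```

Ranking by Dᵢ rather than by residual alone matters here: a point only moves the fitted coefficients substantially when a sizeable residual coincides with sizeable leverage.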

📘 Summary