📘 Section 12.5 – Model Adequacy Checking

🔎 12.5.1 Residual Analysis

Residual: eᵢ = yᵢ − ŷᵢ

Plot the residuals against the fitted values ŷᵢ and against each regressor to detect non-linearity, non-constant variance, or omitted variables
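Standardized Residual: dᵢ = eᵢ / √MSE, where MSE is the mean square error (the estimate of σ²)
Studentized Residual: rᵢ = eᵢ / √(MSE × (1 − hᵢᵢ)), where hᵢᵢ is the leverage of observation i (the i-th diagonal element of the hat matrix)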

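Example (MSE = 5.2352; residuals e₁₅ = 5.84 and e₁₇ = 4.33; leverages 0.0737 and 0.2593 for observations 15 and 17):
d₁₅ = 5.84 / √5.2352 = 2.55
d₁₇ = 4.33 / √5.2352 = 1.89
r₁₅ = 5.84 / √(5.2352 × (1 − 0.0737)) = 2.65
r₁₇ = 4.33 / √(5.2352 × (1 − 0.2593)) = 2.20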
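
The arithmetic above can be checked with a short script. This is a minimal Python sketch: the function names `standardized_residual` and `studentized_residual` are illustrative (not from any library), and the inputs are the residuals, MSE, and leverages quoted in the example.

```python
import math

def standardized_residual(e, mse):
    """d_i = e_i / sqrt(MSE)."""
    return e / math.sqrt(mse)

def studentized_residual(e, mse, h):
    """r_i = e_i / sqrt(MSE * (1 - h_ii))."""
    return e / math.sqrt(mse * (1.0 - h))

MSE = 5.2352
# (residual e_i, leverage h_ii) for observations 15 and 17, as quoted in the example
observations = {15: (5.84, 0.0737), 17: (4.33, 0.2593)}

results = {}
for i, (e, h) in observations.items():
    d = standardized_residual(e, MSE)
    r = studentized_residual(e, MSE, h)
    results[i] = (d, r)
    print(f"obs {i}: d = {d:.2f}, r = {r:.2f}")
```

Because 1 − hᵢᵢ is less than 1, |rᵢ| is never smaller than |dᵢ|, and the gap widens as leverage grows: observation 17 (hᵢᵢ = 0.2593) moves from 1.89 to 2.20, while observation 15 (hᵢᵢ = 0.0737) barely changes.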

🎯 Hat Matrix and Leverage

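Hat Matrix: H = X(X′X)⁻¹X′, where X is the n × p model matrix; H maps the observed responses to the fitted values, ŷ = Hy

Leverage: hᵢᵢ = x′ᵢ(X′X)⁻¹xᵢ, the i-th diagonal element of H, where x′ᵢ is the i-th row of X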
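
To make the two formulas concrete, here is a small pure-Python sketch that builds H for a tiny design matrix and reads the leverages off its diagonal. The design matrix is hypothetical (an intercept column plus one predictor with a single far-out x value), and the helper functions `transpose`, `matmul`, `invert`, and `hat_matrix` are written here for illustration only; in practice a linear algebra library would normally do this work.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def invert(A):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def hat_matrix(X):
    """H = X (X'X)^-1 X'."""
    Xt = transpose(X)
    return matmul(matmul(X, invert(matmul(Xt, X))), Xt)

# Hypothetical design matrix: intercept column plus one predictor.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 10.0]]
H = hat_matrix(X)
leverages = [H[i][i] for i in range(len(X))]
for i, h in enumerate(leverages, start=1):
    print(f"row {i}: h = {h:.2f}")
print(f"sum of leverages = {sum(leverages):.2f}")
```

Two things are worth noticing in the output: the leverages sum to p (here 2, since trace(H) = p), and the row whose x value lies far from the others carries most of the leverage.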

🔥 12.5.2 Influential Observations

Cook’s Distance identifies influential points in regression analysis:

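Dᵢ = r²ᵢ × [hᵢᵢ / (p × (1 − hᵢᵢ))]

where rᵢ is the studentized residual, hᵢᵢ is the leverage, and p is the number of model parameters (including the intercept)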

If Dᵢ > 1 → point is potentially influential

Observation	hᵢᵢ	Dᵢ
1	0.1573	0.035
15	0.0737	0.187
17	0.2593	0.565
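
The tabulated Dᵢ values for observations 15 and 17 can be reproduced from the residuals given in 12.5.1. The sketch below is illustrative: `cooks_distance` is a hypothetical helper name, and p = 3 is an assumption (the notes do not state p for this data set; it is the value that makes the formula agree with the table). Observation 1 is omitted because its residual is not given.

```python
def cooks_distance(e, mse, h, p):
    """D_i = r_i^2 * h_ii / (p * (1 - h_ii)), with r_i the studentized residual."""
    r_squared = e ** 2 / (mse * (1.0 - h))
    return r_squared * h / (p * (1.0 - h))

MSE = 5.2352
P = 3  # assumption: number of model parameters, not stated in the notes

# (residual e_i, leverage h_ii) for observations 15 and 17
observations = {15: (5.84, 0.0737), 17: (4.33, 0.2593)}

cooks = {i: cooks_distance(e, MSE, h, P) for i, (e, h) in observations.items()}
for i, D in cooks.items():
    flag = "influential" if D > 1 else "below cutoff"
    print(f"obs {i}: D = {D:.3f} ({flag})")
```

The results match the table to within about 0.001; the small difference for observation 17 comes from carrying the unrounded rᵢ through the calculation rather than the two-decimal value.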

📘 Summary

Residual plots help assess model fit and nonlinearity
Standardized and studentized residuals detect outliers
Hat matrix shows leverage
Cook’s Distance finds influential data points

🎓 Student GPA Example

We are modeling GPA based on:

x₁: Number of study hours per week
x₂: Total number of credits this semester
x₃: Sleep hours per night

The multiple regression model is:

GPA = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + ε

Suppose that after fitting the model we obtain the following fitted values, observed values, residuals, and leverages for three students:

Student	ŷᵢ	yᵢ	eᵢ	hᵢᵢ
5	3.0	2.3	-0.7	0.12
12	3.4	4.0	0.6	0.08
17	2.5	4.1	1.6	0.28

Let MSE = 0.45, p = 4 (3 predictors + intercept)

🔎 Residual Analysis

Standardized Residual: dᵢ = eᵢ / √MSE
Studentized Residual: rᵢ = eᵢ / √(MSE × (1 − hᵢᵢ))

d₁₇ = 1.6 / √0.45 = 2.39
r₁₇ = 1.6 / √(0.45 × (1 − 0.28)) = 1.6 / √0.324 = 2.81

🎯 Hat Matrix and Leverage

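Leverage: hᵢᵢ = x′ᵢ(X′X)⁻¹xᵢ

Student 17 has the largest leverage in the table (0.28), meaning that student's predictor values lie farthest from the center of the data.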

🔥 Cook’s Distance for GPA Data

Formula:

Dᵢ = r²ᵢ × [hᵢᵢ / (p × (1 − hᵢᵢ))]

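r²₁₇ = (1.6)² / (0.45 × (1 − 0.28)) = 2.56 / 0.324 = 7.90
D₁₇ = 7.90 × [0.28 / (4 × (1 − 0.28))] = 7.90 × 0.0972 = 0.768

Since D₁₇ < 1, the point falls below the usual cutoff for influence, but with the largest leverage (0.28) and a studentized residual above 2 it should be monitored.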
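
The same diagnostics can be computed for all three students in the table at once. This is a minimal sketch using only the tabulated values, MSE = 0.45, and p = 4; the function name `diagnostics` and the variable names are illustrative.

```python
import math

MSE = 0.45
P = 4  # 3 predictors + intercept

# student: (fitted value, observed GPA, leverage) from the table
students = {5: (3.0, 2.3, 0.12), 12: (3.4, 4.0, 0.08), 17: (2.5, 4.1, 0.28)}

def diagnostics(y_hat, y, h, mse=MSE, p=P):
    e = y - y_hat                        # residual
    d = e / math.sqrt(mse)               # standardized residual
    r = e / math.sqrt(mse * (1.0 - h))   # studentized residual
    D = r ** 2 * h / (p * (1.0 - h))     # Cook's distance
    return e, d, r, D

table = {s: diagnostics(*vals) for s, vals in students.items()}
for s, (e, d, r, D) in table.items():
    print(f"student {s}: e = {e:+.2f}, d = {d:+.2f}, r = {r:+.2f}, D = {D:.3f}")

# Student with the largest Cook's distance
most_influential = max(table, key=lambda s: table[s][3])
print("largest Cook's distance: student", most_influential)
```

Student 17 ranks first on all three measures, which is why it is the only one worked through by hand; students 5 and 12 have small residuals and low leverage, so their Cook's distances are close to zero.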

📘 Summary

Use residual plots to assess fit and detect outliers
Standardized residuals scale residuals for comparison
Studentized residuals adjust for leverage
Cook’s Distance flags influential points