In multiple linear regression (MLR), writing out the equations term by term becomes cumbersome as the number of predictors grows. The matrix form gives a compact, efficient way to represent and compute the model, and it maps directly onto software tools such as Excel, R, or Python.
The least squares estimator minimizes the residual sum of squares:

$$RSS(\beta) = (Y - X\beta)'(Y - X\beta), \qquad \hat{\beta} = (X'X)^{-1}X'Y$$
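For reference, the closed form follows from setting the gradient of the RSS to zero, which produces the normal equations:

$$\frac{\partial RSS}{\partial \beta} = -2X'Y + 2X'X\beta = 0 \;\Longrightarrow\; X'X\hat{\beta} = X'Y \;\Longrightarrow\; \hat{\beta} = (X'X)^{-1}X'Y$$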
We model GPA as a function of study hours (x₁), sleep hours (x₂), and attendance (x₃):
| Student | GPA (Y) | Study (x₁) | Sleep (x₂) | Attendance (x₃) | 
|---|---|---|---|---|
| 1 | 3.5 | 15 | 7 | 40 | 
| 2 | 3.8 | 20 | 8 | 42 | 
| 3 | 2.9 | 10 | 6 | 35 | 
| 4 | 3.2 | 12 | 6.5 | 38 | 
| 5 | 3.7 | 18 | 7.5 | 41 | 
Design Matrix X (5×4):
$$X = \begin{bmatrix} 1 & 15 & 7.0 & 40 \\ 1 & 20 & 8.0 & 42 \\ 1 & 10 & 6.0 & 35 \\ 1 & 12 & 6.5 & 38 \\ 1 & 18 & 7.5 & 41 \end{bmatrix}$$
Response Vector Y (5×1):
$$Y = \begin{bmatrix} 3.5 \\ 3.8 \\ 2.9 \\ 3.2 \\ 3.7 \end{bmatrix}$$
Cross-Product Matrix X′X (4×4):

$$X'X = \begin{bmatrix} 5 & 75 & 35.0 & 196 \\ 75 & 1193 & 538.0 & 2984 \\ 35 & 538.0 & 247.5 & 1380.5 \\ 196 & 2984 & 1380.5 & 7714 \end{bmatrix}$$

Cross-Product Vector X′Y (4×1):

$$X'Y = \begin{bmatrix} 17.1 \\ 262.5 \\ 120.85 \\ 674.4 \end{bmatrix}$$
Solving the normal equations X′X β̂ = X′Y gives:

$$\hat{\beta} = (X'X)^{-1}X'Y = \begin{bmatrix} 0.8 \\ 0.1 \\ -0.4 \\ 0.1 \end{bmatrix}$$
Final model: GPA = 0.8 + 0.1×Study − 0.4×Sleep + 0.1×Attendance. (With five observations and four coefficients, this toy dataset happens to be fit exactly, so every residual is zero; with such a small sample, individual coefficients, such as the negative sign on Sleep, should not be over-interpreted.)
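As a check on the hand computation, here is a minimal NumPy sketch (variable names are illustrative) that builds X and Y from the table and solves the normal equations:

```python
import numpy as np

# Design matrix: intercept column, study hours, sleep hours, attendance
X = np.array([
    [1, 15, 7.0, 40],
    [1, 20, 8.0, 42],
    [1, 10, 6.0, 35],
    [1, 12, 6.5, 38],
    [1, 18, 7.5, 41],
])
Y = np.array([3.5, 3.8, 2.9, 3.2, 3.7])

XtX = X.T @ X  # 4x4 cross-product matrix
XtY = X.T @ Y  # 4x1 vector

# Solve X'X beta = X'Y directly; solving the linear system is
# numerically preferable to forming the inverse explicitly
beta_hat = np.linalg.solve(XtX, XtY)
print(beta_hat)  # approximately [0.8, 0.1, -0.4, 0.1]
```

In practice, `np.linalg.lstsq(X, Y, rcond=None)` is the more robust route, since it avoids forming X′X at all.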
Assumptions: Errors εᵢ are independent, E(εᵢ) = 0, and Var(εᵢ) = σ².
Unbiasedness:
E[β̂] = E[(X′X)⁻¹X′Y] = E[(X′X)⁻¹X′(Xβ + ε)] = β + (X′X)⁻¹X′E[ε] = β

because E(ε) = 0 and (X′X)⁻¹X′X = I.
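A quick simulation makes the claim concrete: holding X fixed and redrawing the errors many times, the average of β̂ across replications approaches the true β. The true β and σ below are arbitrary illustration values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed design matrix (same as the example) and an assumed "true" beta
X = np.array([
    [1, 15, 7.0, 40],
    [1, 20, 8.0, 42],
    [1, 10, 6.0, 35],
    [1, 12, 6.5, 38],
    [1, 18, 7.5, 41],
])
beta_true = np.array([0.8, 0.1, -0.4, 0.1])
sigma = 0.2  # illustrative error standard deviation

# Redraw errors, refit, and average the estimates
estimates = []
for _ in range(20_000):
    eps = rng.normal(0.0, sigma, size=X.shape[0])
    Y = X @ beta_true + eps
    estimates.append(np.linalg.solve(X.T @ X, X.T @ Y))

print(np.mean(estimates, axis=0))  # close to beta_true, since E[beta_hat] = beta
```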
Covariance Matrix of β̂:
Cov(β̂) = σ² (X′X)⁻¹ = σ² C
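This follows from the linearity rule Cov(AY) = A·Cov(Y)·A′ applied with A = (X′X)⁻¹X′ and Cov(Y) = σ²I:

$$\operatorname{Cov}(\hat{\beta}) = (X'X)^{-1}X'\,(\sigma^2 I)\,X(X'X)^{-1} = \sigma^2 (X'X)^{-1}$$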
Standard Error:
se(β̂ⱼ) = √(σ̂² Cⱼⱼ), where σ̂² = RSS/(n − p) estimates σ² (n observations, p estimated coefficients; here p = 4).
Interpretation: a small se(β̂ⱼ) means β̂ⱼ is estimated precisely. Statistical software reports these standard errors alongside each coefficient estimate.
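These formulas translate directly into NumPy. Because the five-student example fits exactly (RSS = 0, so σ̂² = 0), the sketch below uses simulated data instead; the sample size, predictor ranges, and noise level are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: n observations, intercept plus three predictors
n = 50
X = np.column_stack([
    np.ones(n),
    rng.uniform(5, 25, n),   # study hours
    rng.uniform(5, 9, n),    # sleep hours
    rng.uniform(30, 45, n),  # attendance
])
beta_true = np.array([0.8, 0.1, -0.4, 0.1])
Y = X @ beta_true + rng.normal(0.0, 0.3, size=n)

p = X.shape[1]                  # number of estimated coefficients
C = np.linalg.inv(X.T @ X)      # C = (X'X)^-1
beta_hat = C @ X.T @ Y

resid = Y - X @ beta_hat
sigma2_hat = resid @ resid / (n - p)   # sigma-hat^2 = RSS / (n - p)

se = np.sqrt(sigma2_hat * np.diag(C))  # se(beta_j) = sqrt(sigma2_hat * C_jj)
for name, b, s in zip(["Intercept", "Study", "Sleep", "Attendance"], beta_hat, se):
    print(f"{name:>10}: {b: .4f}  (se = {s:.4f})")
```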