Data → Data Analysis → Regression / Correlation).
For the “no ToolPak, all Excel functions + matrix formula” version of
Chapter 12 MLR, use the companion page
👉
Ch12 — MLR (GPA) • Excel ICE (functions only)
.
1. Click Copy CSV. 2. In Excel, click into cell A1 (not edit mode) and press Ctrl+V. You should see A–D as StudyHW, TutorHours, ExamPrep, FinalGrade and rows 1..24 (23 students).
First, use Data → Data Analysis → Correlation to let Excel create the full 4×4 correlation matrix for you.
Now recreate that same 4×4 table yourself on the original data sheet using
CORREL. We’ll use a blank block in F1:J5.
Assume:
A = StudyHW, B = TutorHours, C = ExamPrep, D = FinalGrade, rows 2..24 are students.
Labels:
F1: (leave blank)
G1: StudyHW
H1: TutorHours
I1: ExamPrep
J1: FinalGrade
F2: StudyHW
F3: TutorHours
F4: ExamPrep
F5: FinalGrade
Diagonal (correlation of a variable with itself):
G2: 1
H3: 1
I4: 1
J5: 1
Upper triangle — each CORREL uses two column ranges:
H2: =CORREL($A$2:$A$24,$B$2:$B$24) // r(StudyHW,TutorHours)
I2: =CORREL($A$2:$A$24,$C$2:$C$24) // r(StudyHW,ExamPrep)
J2: =CORREL($A$2:$A$24,$D$2:$D$24) // r(StudyHW,FinalGrade)
I3: =CORREL($B$2:$B$24,$C$2:$C$24) // r(TutorHours,ExamPrep)
J3: =CORREL($B$2:$B$24,$D$2:$D$24) // r(TutorHours,FinalGrade)
J4: =CORREL($C$2:$C$24,$D$2:$D$24) // r(ExamPrep,FinalGrade)
// Lower triangle — mirror the upper triangle:
G3: =H2
G4: =I2
G5: =J2
H4: =I3
H5: =J3
I5: =J4
CORREL has exactly two ranges:
first variable’s column, second variable’s column. This hand-built F1:J5
table should match the ToolPak correlation matrix from part D1.
ToolPak does not show VIF. We can compute VIF from VIFj = 1 / (1 − R²j), where R²j is from regressing Xj on the other X’s (with intercept).
With our layout (StudyHW=A, TutorHours=B, ExamPrep=C, rows 2..24), use:
AA10: "VIF StudyHW"
AB10: =1/(1-INDEX(LINEST($A$2:$A$24,$B$2:$C$24,TRUE,TRUE),3,1))
AA11: "VIF TutorHours"
AB11: =1/(1-INDEX(
LINEST($B$2:$B$24,
CHOOSE({1,2},$A$2:$A$24,$C$2:$C$24),
TRUE,TRUE),
3,1))
AA12: "VIF ExamPrep"
AB12: =1/(1-INDEX(
LINEST($C$2:$C$24,
CHOOSE({1,2},$A$2:$A$24,$B$2:$B$24),
TRUE,TRUE),
3,1))
Students can copy this and fill in the numbers from the ToolPak output.
This part is for students who want to see the math behind what Excel is doing: OLS multiple regression, correlation, and VIF.
Our model is
FinalGrade = β₀ + β₁·StudyHW + β₂·TutorHours + β₃·ExamPrep + ε.
In matrix form we write:
Y = n×1 column of FinalGrade values
X = n×p design matrix (columns: 1, StudyHW, TutorHours, ExamPrep)
β = p×1 column of unknown coefficients
ε = n×1 column of errors
Model: Y = Xβ + ε
Ordinary Least Squares (OLS) chooses β to minimize the sum of squared residuals
Σ eᵢ² = Σ (yᵢ − ŷᵢ)². The solution is the famous closed-form formula:
β̂ = (XᵀX)⁻¹ XᵀY
From this we get predictions ŷ = Xβ̂, residuals e = Y − ŷ, and the error variance estimate
MSE = SSE / (n − p), where SSE = Σeᵢ² and p = number of parameters
(here p = 4: intercept + 3 slopes).
The sample correlation between variables X and Y is:
r(X,Y) = [ Σ (xᵢ − x̄)(yᵢ − ȳ) ] / [ (n−1)·sₓ·sᵧ ]
where:
x̄ = sample mean of X
ȳ = sample mean of Y
sₓ² = Σ (xᵢ − x̄)² / (n−1) (sample variance of X)
sᵧ² = Σ (yᵢ − ȳ)² / (n−1) (sample variance of Y)
In Excel, =CORREL(X_range, Y_range) computes this r directly.
The 4×4 correlation matrix is just all pairwise r’s arranged in a table.
For each predictor Xj in a multiple regression, the Variance Inflation Factor (VIF) tells you how much the variance of its slope β̂j is inflated by collinearity with the other X’s.
Let Rj2 be the R-squared from regressing Xj on all the other predictors (with an intercept). Then
VIFⱼ = 1 / (1 − Rⱼ²)
If there were no collinearity, the variance of β̂j would look like
σ² / Σ(xⱼᵢ − x̄ⱼ)². With other correlated predictors, it becomes
Var(β̂ⱼ) = [σ² / Σ(xⱼᵢ − x̄ⱼ)²] · VIFⱼ
So VIFⱼ is literally the factor by which the variance (and standard error) of the slope is blown up by multicollinearity.
Rules of thumb:
If you’ve never seen regression before, watch this short video first. It shows the basic idea of fitting a line before we jump into multiple regression and matrix formulas.
To connect the correlation matrix and VIF to real data problems, you can also watch this short video on multicollinearity.
© Simple MLR ToolPak • Single-file, offline.