This page is a cheat sheet for the Excel functions and formulas used in
Chapter 12 apps (MLR, VIF, Cook’s Distance, Adjusted R²). It is not a new
assignment — just a quick reference so you don’t have to hunt through slides.
Notation in the formulas:
| Quantity / Function | Excel formula (generic pattern) | What it means / Example |
|---|---|---|
| Slope β₁ (simple regression) |
=SLOPE(Y_range, X1_range)
|
Returns the slope of the line predicting Y from one X. Example: =SLOPE(D2:D24, A2:A24) gives how many grade points
FinalGrade changes when StudyHW increases by 1 hour.
|
| Intercept β₀ (simple regression) |
=INTERCEPT(Y_range, X1_range)
|
Predicted Y when X = 0. Example: =INTERCEPT(D2:D24, A2:A24) gives the predicted
FinalGrade when StudyHW = 0.
|
| Multi-X β’s via LINEST (slopes & intercept) |
=LINEST(Y_range, X_range, TRUE, TRUE)Use with INDEX to pull specific pieces, e.g.:=INDEX(LINEST(Y_range, X_range, TRUE, TRUE), 1, 1)
|
LINEST with TRUE,TRUE returns an array containing:
slopes, intercept, standard errors, R², F, etc.In practice, we usually use the matrix-formula approach in the ICE, but LINEST is a built-in alternative for multi-predictor MLR.
|
| Matrix formula for β̂ (multi-X) |
Assuming you built the design matrix X (including a column of 1’s) in
F2:I24 and Y in D2:D24, you can use:
=MMULT(
|
Computes the 4×1 vector \( \hat\beta = (X'X)^{-1}X'Y \) used in the functions-only ICE app. The result spills into 4 cells: intercept, β(StudyHW), β(TutorHours), β(ExamPrep). |
| Predicted value ŷ |
If β’s are in B1:B3 and X’s in row 2:
= $B$1 + $B$2*StudyHours_2 + $B$3*SleepHours_2
|
Plug the fitted coefficients into the model to get the predicted score for each student. In the ICE, this is the “yhat” column. |
| Residual e |
= Y_cell - yhat_cell
Example: =D2 - L2
|
Actual minus predicted: how far off the model is for that student. Residuals should bounce randomly around 0 if the model is reasonable. |
| SSE (Sum of Squared Errors) |
=SUMSQ(Residual_range)Example: =SUMSQ(M2:M24)
|
Adds up e² across all students. Smaller SSE means the line
is closer to the points overall.
|
| Residual degrees of freedom |
= n - pExample (23 students, 3 slopes + intercept): =23 - 4 → 19
|
df = number of data points minus number of parameters in the model. Used for MSE, t-tests, and confidence intervals. |
| MSE and Root MSE |
MSE: = SSE_cell / df_cellRoot MSE: =SQRT(MSE_cell)
|
MSE is the average squared residual; Root MSE is in the same units as Y (e.g., grade points). Root MSE ≈ typical prediction error. |
| R² (coefficient of determination) |
Simple regression:=RSQ(Y_range, X1_range)From predictions: =RSQ(Y_range, Yhat_range)
|
Percent of variation in Y explained by the model. Example: R² = 0.72 → about 72% of the variation in test scores is explained by the predictors. |
| Adjusted R² |
If R² is in C2, n in C3, p in C4:
=1 - (1 - C2) * (C3 - 1) / (C3 - C4)
|
Adjusted R² penalizes adding “junk” predictors. It can go down if a new X doesn’t help much. Use it when comparing models with different numbers of predictors. |
| t-statistic for a slope βⱼ |
If βⱼ is in B2 and its SE is in C2:
= B2 / C2
|
Measures how many SE’s the estimated slope is away from 0. Large |t| usually means a small p-value and stronger evidence that the slope ≠ 0. |
| p-value for slope (two-sided) |
If t is in D2 and df in E2:
=T.DIST.2T(ABS(D2), E2)
|
Probability of seeing a t-statistic this extreme (or more) if the true slope were 0. If p < 0.05 → slope is statistically significant at 5%. |
| 95% confidence interval for βⱼ |
With βⱼ in B2, SE in C2, α in 0.05,
df in E2:Critical t: =T.INV.2T(0.05, E2)CI lower: =B2 - tcrit_cell * C2CI upper: =B2 + tcrit_cell * C2
|
Gives a range of plausible values for the true slope. If 0 is not inside the interval, the slope is significant at α = 0.05. |
| Quantity / Function | Excel formula | What it means |
|---|---|---|
| Correlation r(X, Y) |
=CORREL(X_range, Y_range)Example: =CORREL(A2:A24, D2:D24)
|
Measures strength and direction of linear association (–1 ≤ r ≤ 1). Rough guide:
|
| Correlation matrix (4×4, by hand) |
Use =CORREL() for each pair and mirror across the diagonal.Example: =CORREL($A$2:$A$24, $B$2:$B$24) → r(StudyHW,TutorHours)
|
Table of r’s for all pairs of variables (StudyHW, TutorHours, ExamPrep, FinalGrade). Large |r| among X’s signals possible multicollinearity. |
| VIF for a predictor Xⱼ (from R² of Xⱼ on other X’s) |
1. Get R²ⱼ from regressing Xⱼ on other X’s (can use RSQ or
LINEST).2. Then compute: =1 / (1 - R2j_cell)
|
VIF tells how much the variance of βⱼ is inflated by collinearity.
|
| VIF via LINEST (example) |
Example for StudyHours (X₁) regressed on SleepHours (X₂):
=1/(1-INDEX(
|
Here row 3, column 1 of the LINEST output is R² for the
regression of X₁ on X₂. Plug into 1/(1−R²) to get VIF for X₁.
|
| Quantity / Formula | Excel expression (pattern) | What it means / How to use |
|---|---|---|
| Standardized residual rᵢ |
If residual eᵢ is in E2, leverage hᵢᵢ in F2,
MSE in $P$4:
= E2 / ( SQRT($P$4) * SQRT(1 - F2) )
|
Rescales residuals so they are comparable across observations, taking into account overall spread (MSE) and leverage. Large |rᵢ| indicates a point that doesn’t fit the line well. |
| Leverage hᵢᵢ |
In the model-adequacy ICE, leverage is usually computed via matrix
formulas; you can store hᵢᵢ in its own column (say F2:F24).
|
Measures how “far” the X-pattern of observation i is from the center of the X’s. High leverage points can pull the fitted line toward them. |
| Cook’s Distance Dᵢ (using rᵢ and hᵢᵢ) |
With rᵢ in G2, hᵢᵢ in F2, and p = number of
parameters (intercept + slopes) in $Q$2:
= (G2^2 * F2) / ( $Q$2 * (1 - F2) )
|
Cook’s Distance measures how much all fitted β’s would change if
observation i were removed. Rules of thumb:
|
| Alternative Cook’s D formula (using eᵢ directly) |
If you have raw residual eᵢ (E2), leverage hᵢᵢ (F2), MSE (P4),
and p = number of parameters (Q2):
= (E2^2 * F2) / ( $Q$2 * $P$4 * (1 - F2)^2 )
|
This is algebraically equivalent to the rᵢ-based formula above. Use whichever matches the quantities you’ve already computed in your sheet. |
💡 How to use this page: When you’re working on Chapter 12 homework or ICE apps, keep this open as a “formula menu.” Find the quantity you need (β, R², Root MSE, correlation, VIF, or Cook’s D), copy the pattern, and then swap in your own cell ranges.