ENGR 200 – Final Part II (Complete Excel LINEST Solution)
Model: Y = FinalProjectScore, X₁ = DesignLabHours, X₂ = HomeworkQuality, n = 23.
Goal: use LINEST (no ToolPak) and build a table with β, SE, t, p together, plus VIF and conceptual answers.
Data file for this solution: final-data.xlsx (place this Excel file in the same folder as this page, then open it in Excel).
Step 0 – Data layout in Excel
Open final-data.xlsx. The sheet should have:
- A1: DesignLabHours, data in A2:A24
- B1: HomeworkQuality, data in B2:B24
- C1: FinalProjectScore, data in C2:C24
- There are n = 23 students (rows 2–24).
We define: Y = FinalProjectScore, X₁ = DesignLabHours, X₂ = HomeworkQuality.
Step 1 – Put LINEST output in a block (J2:L6)
1. Select the 5×3 block J2:L6.
2. Type the LINEST formula:
=LINEST($C$2:$C$24,$A$2:$B$24,TRUE,TRUE)
3. Confirm as an array formula:
• Older Excel: press Ctrl + Shift + Enter.
• Excel 365: press Enter (it spills automatically).
For two X’s, LINEST fills J2:L6 as:
| Cell | Meaning |
| --- | --- |
| J2 | b₂ (slope for X₂ = HomeworkQuality) |
| K2 | b₁ (slope for X₁ = DesignLabHours) |
| L2 | b₀ (intercept) |
| J3 | Std error of b₂ |
| K3 | Std error of b₁ |
| L3 | Std error of b₀ |
| J4 | R² |
| K4 | Standard error of the y estimate (sₑ) |
| J5 | F statistic |
| K5 | Error df = n − k − 1 |
| J6 | Regression SS (SSR) |
| K6 | Residual SS (SSE) |
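To see where each cell of the J2:L6 block comes from, the same statistics can be rebuilt by hand with ordinary least squares. This is a minimal Python sketch using synthetic stand-in data (the real values live in final-data.xlsx, so the numbers below will not match the exam output):

```python
import numpy as np

# Synthetic stand-ins for the exam data (NOT final-data.xlsx)
rng = np.random.default_rng(0)
n = 23
x1 = rng.uniform(2, 10, n)          # stand-in for DesignLabHours
x2 = rng.uniform(4, 10, n)          # stand-in for HomeworkQuality
y = 35 + 4 * x1 + 3 * x2 + rng.normal(0, 5, n)

X = np.column_stack([np.ones(n), x1, x2])    # intercept column first
beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)  # b0, b1, b2

resid = y - X @ beta
sse = resid @ resid                           # Residual SS  -> K6
sst = ((y - y.mean()) ** 2).sum()             # Total SS
ssr = sst - sse                               # Regression SS -> J6
df_err = n - 3                                # n - k - 1     -> K5
mse = sse / df_err
se_beta = np.sqrt(np.diag(mse * np.linalg.inv(X.T @ X)))  # std errors -> row 3
r2 = ssr / sst                                # R²            -> J4
f_stat = (ssr / 2) / mse                      # F statistic   -> J5
```

The only trap when mapping this onto LINEST is column order: LINEST lists the slopes right-to-left (b₂ first, intercept last), while the `beta` vector here runs intercept first.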
Step 2 – Summary table with β, SE, t Stat, p-value together
Create a clean output table (like ToolPak) in D10:H12:
| Row | Term (D) | β Coefficient (E) | Std Error (F) | t Stat (G) | p-value, 2-sided (H) |
| --- | --- | --- | --- | --- | --- |
| 10 | Intercept (b₀) | =L2 | =L3 | =L2/L3 | =2*(1-T.DIST(ABS(L2/L3),$K$5,TRUE)) |
| 11 | X₁ – DesignLabHours (b₁) | =K2 | =K3 | =K2/K3 | =2*(1-T.DIST(ABS(K2/K3),$K$5,TRUE)) |
| 12 | X₂ – HomeworkQuality (b₂) | =J2 | =J3 | =J2/J3 | =2*(1-T.DIST(ABS(J2/J3),$K$5,TRUE)) |
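If you want to sanity-check the t and p formulas outside Excel, here is a small Python sketch fed with the reported coefficients and standard errors. SciPy's Student-t survival function plays the role of `2*(1-T.DIST(ABS(t), df, TRUE))`:

```python
from scipy import stats

df_err = 20  # the value LINEST puts in K5

# Reported (beta, std error) pairs from the summary table
rows = {
    "Intercept": (36.5564, 8.1440),
    "X1 DesignLabHours": (4.0231, 1.3649),
    "X2 HomeworkQuality": (3.1777, 1.3615),
}

results = {}
for term, (b, se) in rows.items():
    t = b / se
    p = 2 * stats.t.sf(abs(t), df_err)  # two-sided p-value
    results[term] = (t, p)
    print(f"{term}: t = {t:.4f}, p = {p:.6f}")
```

The printed t statistics and p-values reproduce the table above to rounding.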
Now β, SE, t, p for each coefficient are in the same row just like the ToolPak, but everything comes from one LINEST call.
Numeric result of the summary table
| Term | β | Std Error | t Stat | p-value |
| --- | --- | --- | --- | --- |
| Intercept | 36.5564 | 8.1440 | 4.4887 | 0.000225 |
| X₁ – DesignLabHours | 4.0231 | 1.3649 | 2.9474 | 0.00796 |
| X₂ – HomeworkQuality | 3.1777 | 1.3615 | 2.3340 | 0.0301 |
Step 3 – R², F, sums of squares, prediction
3.1 R² and Adjusted R²
- R² in e.g. D15: =J4 → ≈ 0.7316
- Adjusted R² (with n = 23, k = 2 predictors) in e.g. E15: =1-(1-$J$4)*(23-1)/(23-2-1) → ≈ 0.7048
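The adjusted-R² arithmetic is easy to verify by hand; this one-liner mirrors the Excel formula `=1-(1-$J$4)*(23-1)/(23-2-1)` using the reported R²:

```python
# Adjusted R² from the reported R² (J4), with n = 23, k = 2
r2, n, k = 0.7316, 23, 2
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(round(adj_r2, 4))  # → 0.7048
```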
3.2 SSE, SSR, SST
- SSE (Residual SS) in e.g. D16:
=K6 → ≈ 1443.2263
- SSR (Regression SS) in e.g. E16:
=J6 → ≈ 3934.8232
- SST in e.g. F16:
=J6+K6 → ≈ 5378.0496
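A quick consistency check ties these three numbers together: SST must equal SSR + SSE, and R² = SSR/SST should reproduce J4 ≈ 0.7316. (Any drift in the last digit comes from the 4-decimal rounding of SSR and SSE.)

```python
# Sums-of-squares consistency check using the reported values
ssr, sse = 3934.8232, 1443.2263
sst = ssr + sse          # SST = SSR + SSE
r2 = ssr / sst           # should match J4
print(round(sst, 4), round(r2, 4))
```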
3.3 F statistic and Significance F
- F in e.g. D17: =J5 → ≈ 27.2641 (df₁ = 2, df₂ = 20)
- Significance F (overall p-value) in e.g. E17: =F.DIST.RT(J5,2,$K$5) → ≈ 1.94E-06 (≈ 0.00000194)
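The F statistic can also be rebuilt from the sums of squares as F = MSR/MSE with df₁ = 2 and df₂ = 20; SciPy's F survival function is the Python analogue of `=F.DIST.RT(J5,2,$K$5)`:

```python
from scipy import stats

# Rebuild F from the reported sums of squares
ssr, sse, df1, df2 = 3934.8232, 1443.2263, 2, 20
f_stat = (ssr / df1) / (sse / df2)   # MSR / MSE
sig_f = stats.f.sf(f_stat, df1, df2)  # right-tail p-value (Significance F)
print(round(f_stat, 4), sig_f)
```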
3.4 Prediction for a specific X₁, X₂
Put X₁* and X₂* in cells:
- K20: X₁* (e.g. 8)
- L20: X₂* (e.g. 9)
- M20: prediction Ŷ*:
=$L$2 + $K$2*K20 + $J$2*L20
For X₁ = 8, X₂ = 9, Excel gives Ŷ ≈ 97.34.
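The prediction formula is plain arithmetic with the fitted coefficients, so it is easy to confirm outside Excel:

```python
# Ŷ = b0 + b1·X1 + b2·X2 with the fitted coefficients from the summary table
b0, b1, b2 = 36.5564, 4.0231, 3.1777
x1_star, x2_star = 8, 9
y_hat = b0 + b1 * x1_star + b2 * x2_star
print(round(y_hat, 2))  # → 97.34
```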
Step 4 – Corr(X₁, X₂) and VIF
4.1 Corr(X₁, X₂)
In e.g. D19, correlation between DesignLabHours and HomeworkQuality:
=CORREL($A$2:$A$24,$B$2:$B$24)
→ Corr ≈ 0.7437
4.2 VIF for both predictors
With only two predictors, both share the same VIF:
VIF = 1 / (1 − r²), where r = Corr(X₁, X₂).
In e.g. E19 (if D19 holds the correlation):
=1/(1-D19^2)
→ VIF ≈ 2.2377 for both X₁ and X₂.
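The same VIF shortcut in Python, mirroring `=1/(1-D19^2)` (the tiny last-digit drift versus 2.2377 comes from using the 4-decimal rounded correlation):

```python
# VIF from the pairwise correlation (valid only for two predictors)
r = 0.7437
vif = 1 / (1 - r**2)
print(round(vif, 4))
```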
4.3 Why we used CORREL here (and when you must use RSQ)
In general, the Variance Inflation Factor for a predictor Xⱼ is defined as
VIFⱼ = 1 / (1 − Rⱼ²), where Rⱼ² is the R-squared
from the auxiliary regression of Xⱼ on all the other X’s.
So VIF is officially based on an R², not directly on a correlation.
In our model we only have two predictors, X₁ and X₂. To get VIF for X₁, we regress X₁ on X₂
(a simple regression with one X). In a simple regression with an intercept,
the R² equals the squared correlation:
R² = r². That means:
- R² for regressing X₁ on X₂ = CORREL(X₁, X₂)²
- R² for regressing X₂ on X₁ = CORREL(X₁, X₂)² as well
This is why, with only two X’s, we can compute VIF as
=1/(1 - CORREL(A2:A24,B2:B24)^2).
It is exactly the same as using the RSQ from regressing one X on the other.
For models with three or more predictors, pairwise correlations are not enough.
You must regress each X on all the other X’s and use that regression’s R² in
VIF = 1 / (1 − R²). In this exam we stay with the two-predictor case, so the
CORREL-based shortcut is valid and easy to use.
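The identity behind the shortcut (R² of a simple regression with intercept = squared correlation) can be checked numerically. This sketch uses synthetic data, not the exam data:

```python
import numpy as np

# Synthetic predictors (NOT the exam data)
rng = np.random.default_rng(1)
x2 = rng.normal(7, 1.5, 23)
x1 = 1.2 + 0.8 * x2 + rng.normal(0, 1.0, 23)

r = np.corrcoef(x1, x2)[0, 1]        # pairwise correlation (CORREL)

# R² of the auxiliary regression X1 ~ X2 (with intercept)
X = np.column_stack([np.ones(23), x2])
beta, _, _, _ = np.linalg.lstsq(X, x1, rcond=None)
resid = x1 - X @ beta
r2_aux = 1 - (resid @ resid) / ((x1 - x1.mean()) ** 2).sum()

print(abs(r2_aux - r**2))            # essentially zero: R² = r²
```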
Step 5 – Answers to exam questions (using the Excel output)
Q1. Which predictors are statistically significant at α = 0.05? How do you know?
From the summary table:
- X₁ (DesignLabHours): p ≈ 0.00796 < 0.05
- X₂ (HomeworkQuality): p ≈ 0.0301 < 0.05
Both p-values are below 0.05.
Answer: Both X₁ and X₂ are statistically significant at the 5% level.
Q2. Interpret b₁ (DesignLabHours).
b₁ ≈ 4.0231.
Interpretation: Holding HomeworkQuality constant, each additional hour in the design lab is associated with an increase of about 4.02 points in the predicted FinalProjectScore, on average.
Q3. Interpret b₂ (HomeworkQuality).
b₂ ≈ 3.1777.
Interpretation: Holding DesignLabHours constant, each additional 1-point increase in HomeworkQuality is associated with an increase of about 3.18 points in the predicted FinalProjectScore, on average.
Q4. Interpret R². Is the model a good fit?
R² ≈ 0.7316.
Interpretation: About 73% of the variation in FinalProjectScore is explained by DesignLabHours and HomeworkQuality together. For real education data, 73% explained variance is strong.
Answer: Yes, this is a good fit.
Q5. Use the global F-test to decide if the regression is useful overall.
From LINEST:
- F ≈ 27.2641 with df₁ = 2 and df₂ = 20
- Significance F ≈ 1.94 × 10⁻⁶ < 0.05
Hypotheses:
- H₀: β₁ = β₂ = 0 (no linear relationship; regression not useful)
- H₁: At least one β is nonzero (regression useful)
Since the p-value is extremely small, we reject H₀.
Answer: The regression is statistically significant and useful overall.
Q6. Based on VIF and Corr(X₁, X₂), is multicollinearity a concern?
- Corr(X₁, X₂) ≈ 0.7437 (moderately high, but not near 1)
- VIF ≈ 2.2377 (well below common concern thresholds like 5 or 10)
Answer: There is some correlation between the predictors, but the VIF is low. Multicollinearity is not a serious concern in this model.
Q7. Predict Y when X₁ = 8 and X₂ = 9.
Use the estimated equation:
Ŷ = 36.5564 + 4.0231·X₁ + 3.1777·X₂
Plug in X₁ = 8, X₂ = 9:
- 4.0231 × 8 = 32.1848
- 3.1777 × 9 = 28.5993
Sum: 36.5564 + 32.1848 + 28.5993 ≈ 97.34
Answer: The predicted FinalProjectScore is about 97.34 points.