ENGR 200 – Final Part II (Complete Excel LINEST Solution)

Model: Y = FinalProjectScore, X₁ = DesignLabHours, X₂ = HomeworkQuality, n = 23. Goal: use LINEST (no ToolPak) and build a table with β, SE, t, p together, plus VIF and conceptual answers.
📂 Data file for this solution: final-data.xlsx (Place this Excel file in the same folder as this HTML page, then open it in Excel.)

Step 0 – Data layout in Excel

Open final-data.xlsx. The sheet should have:

We define: Y = FinalProjectScore, X₁ = DesignLabHours, X₂ = HomeworkQuality.

Step 1 – Put LINEST output in a block (J2:L6)

1. Select the 5×3 block J2:L6.
2. Type the LINEST formula:

=LINEST($C$2:$C$24,$A$2:$B$24,TRUE,TRUE)

3. Confirm as an array formula:
  • Older Excel: press Ctrl + Shift + Enter.
  • Excel 365: press Enter (it spills automatically).

For two X’s, LINEST fills J2:L6 as:

CellMeaning
J2b₂ (slope for X₂ = HomeworkQuality)
K2b₁ (slope for X₁ = DesignLabHours)
L2b₀ (intercept)
J3Std error of b₂
K3Std error of b₁
L3Std error of b₀
J4
K4Standard error of Y
J5F statistic
K5df (error df = n − k − 1)
J6Regression SS (SSR)
K6Residual SS (SSE)

Step 2 – Summary table with β, SE, t Stat, p-value together

Create a clean output table (like ToolPak) in D10:H12:

Row Term (in D) β (Coefficient) (E) Std Error (F) t Stat (G) p-value 2-sided (H)
10 Intercept (b₀) =L2 =L3 =L2/L3 =2*(1-T.DIST(ABS(L2/L3),$K$5,TRUE))
11 X₁ – DesignLabHours (b₁) =K2 =K3 =K2/K3 =2*(1-T.DIST(ABS(K2/K3),$K$5,TRUE))
12 X₂ – HomeworkQuality (b₂) =J2 =J3 =J2/J3 =2*(1-T.DIST(ABS(J2/J3),$K$5,TRUE))

Now β, SE, t, p for each coefficient are in the same row just like the ToolPak, but everything comes from one LINEST call.

Numeric result of the summary table

Term β Std Error t Stat p-value
Intercept 36.5564 8.1440 4.4887 0.000225
X₁ – DesignLabHours 4.0231 1.3649 2.9474 0.00796
X₂ – HomeworkQuality 3.1777 1.3615 2.3340 0.0301

Step 3 – R², F, sums of squares, prediction

3.1 R² and Adjusted R²

3.2 SSE, SSR, SST

3.3 F statistic and Significance F

3.4 Prediction for a specific X₁, X₂

Put X₁* and X₂* in cells:

=$L$2 + $K$2*K20 + $J$2*L20

For X₁ = 8, X₂ = 9, Excel gives Ŷ ≈ 97.34.

Step 4 – Corr(X₁, X₂) and VIF

4.1 Corr(X₁, X₂)

In e.g. D19, correlation between DesignLabHours and HomeworkQuality:

=CORREL($A$2:$A$24,$B$2:$B$24)

→ Corr ≈ 0.7437

4.2 VIF for both predictors

With only two predictors, both share the same VIF: VIF = 1 / (1 − r²), where r = Corr(X₁, X₂).

In e.g. E19 (if D19 holds the correlation):

=1/(1-D19^2)

→ VIF ≈ 2.2377 for both X₁ and X₂.

4.3 Why we used CORREL here (and when you must use RSQ)

In general, the Variance Inflation Factor for a predictor Xj is defined as VIFj = 1 / (1 − Rj2), where Rj2 is the R-squared from the auxiliary regression of Xj on all the other X’s. So VIF is officially based on an R², not directly on a correlation.

In our model we only have two predictors, X₁ and X₂. To get VIF for X₁, we regress X₁ on X₂ (a simple regression with one X). In a simple regression with an intercept, the R² equals the squared correlation: R² = r². That means:

This is why, with only two X’s, we can compute VIF as =1/(1 - CORREL(A2:A24,B2:B24)^2). It is exactly the same as using the RSQ from regressing one X on the other.

For models with three or more predictors, pairwise correlations are not enough. You must regress each X on all the other X’s and use that regression’s R² in VIF = 1 / (1 − R²). In this exam we stay with the two-predictor case, so the CORREL-based shortcut is valid and easy to use.

Step 5 – Answers to exam questions (using the Excel output)

Q1. Which predictors are statistically significant at α = 0.05? How do you know?
From the summary table: Both p-values are below 0.05. Answer: Both X₁ and X₂ are statistically significant at the 5% level.
Q2. Interpret b₁ (DesignLabHours).
b₁ ≈ 4.0231.
Interpretation: Holding HomeworkQuality constant, each additional 1 hour in the design lab is associated with an increase of about 4.02 points in the predicted FinalProjectScore on average.
Q3. Interpret b₂ (HomeworkQuality).
b₂ ≈ 3.1777.
Interpretation: Holding DesignLabHours constant, each additional 1-point increase in HomeworkQuality is associated with an increase of about 3.18 points in the predicted FinalProjectScore on average.
Q4. Interpret R². Is the model a good fit?
R² ≈ 0.7316.
Interpretation: About 73% of the variation in FinalProjectScore is explained by DesignLabHours and HomeworkQuality together.
For real education data, 73% explained variance is strong. Answer: Yes, this is a good fit.
Q5. Use the global F-test to decide if the regression is useful overall.
From LINEST: Hypotheses: Since the p-value is extremely small, we reject H₀.
Answer: The regression is statistically significant and useful overall.
Q6. Based on VIF and Corr(X₁, X₂), is multicollinearity a concern?
Answer: There is some correlation between the predictors, but the VIF is low. Multicollinearity is not a serious concern in this model.
Q7. Predict Y when X₁ = 8 and X₂ = 9.
Use the estimated equation:
Ŷ = 36.5564 + 4.0231·X₁ + 3.1777·X₂

Plug in X₁ = 8, X₂ = 9: Sum:
36.5564 + 32.1848 + 28.5993 ≈ 97.34

Answer: The predicted FinalProjectScore is about 97.34 points.