Chapter 11 – ICE — Excel-First Regression (No Math)

Prefer the one-click Excel summary? 👉 Use the ToolPak version.
Do the whole simple linear regression in Excel: compute the line, check fit, make a forecast. App also shows t-stats, p-values, and ANOVA F + a live guide for judging model quality. (Derivations are on the Math ICE.)

Intro — What problem are we solving?

Regression finds a straight-line rule y = b0 + b1·x to predict Y from X. Example: “If a student studies more hours, how much might their exam score increase?” This page stays Excel-first: functions, a trendline, and a forecast at your chosen x₀. The Math ICE shows why OLS = MLE under Normal noise.

Step A — Data (X, Y)

Type or paste below, or click “Generate data.” Columns: X, Y.
#XY
Tip: After copying to Excel, put X in A2:AN and Y in B2:BN. Paste by single-clicking A1 → Ctrl+V. If needed: Data → Text to Columns → Delimited → Tab.

Step B — Excel functions you’ll use

Copy with the equals sign so results calculate immediately.
Core formulas (copy me ⤵)
Slope (b1):
Intercept (b0):
R²:
Correlation (r):
Std error of y|x (STEYX):
Predicted y at x = :
Residuals (if b0 in E2, b1 in E3):
Full stats via LINEST (enter as dynamic array in a 5×2 block):

Step C — Visual check (should match Excel)

✅ Data ready ✅ SLOPE & INTERCEPT ✅ RSQ & CORREL ✅ Forecast at x₀ ✅ Residuals

Step D — How to do this in Excel (student checklist)

  1. Paste data: X in A2:AN, Y in B2:BN.
  2. Compute =SLOPE, =INTERCEPT, =RSQ, =CORREL, =STEYX, =FORECAST.LINEAR.
  3. Insert → Scatter; add Trendline and display equation & R².
  4. Add residuals: =B2 - ($E$2 + $E$3*A2). Check curve/outliers.
  5. Interpret: slope units, intercept meaning, R² strength, typical error size, and whether x₀ is inside the X-range.
Judge fit: (i) high, (ii) STEYX small (Y-units), (iii) p(b₁) small, (iv) p(F) small, (v) residuals look random (no curve/funnel/outliers).

Step E — Plain-English explanation of your results

Step F — Significance & ANOVA

t & p for slope b₁

t = b₁ / SE(b₁), df = n−2
t(b₁):
p(two-tailed):
Excel (copy-paste)
t(b₁):
df:
p(two-tailed):

t & p for intercept b₀

t = b₀ / SE(b₀)
t(b₀):
p(two-tailed):
Excel (copy-paste)
t(b₀):
p(two-tailed):

ANOVA — overall model

F = MSR / MSE; df₁=1, df₂=n−2
F:
p(F):
Excel (copy-paste)
F:
p(F):
SourceSSdfMSFp
Regression1
Error
Total
How to read this table: SST is total variation in Y (about its mean). SSR is the part explained by the line. SSE is the leftover error. Divide SS by df to get MS. The F test compares explained vs leftover variation (MSR/MSE). A small p(F) means the line captures real signal (reject H₀: β₁=0). In simple regression, F = t(b₁)².

Step G — How to tell if the model is “good” (live with your numbers)

Hypotheses & Excel formulas (students can copy)
Slope test (t) — H₀: β₁=0, H₁: β₁≠0 (two-tailed).
t = b1 / SE(b1), df = n−2, p = T.DIST.2T(ABS(t), df)
With LINEST:
t(b1):
df:
p(two-tailed):
Overall model (F) — H₀: “no linear signal” (β₁=0).
F = MSR/MSE, p(F)=F.DIST.RT(F,1,n−2)
F:
p(F):
SS pieces in Excel: SST=DEVSQ(Y), SSR=RSQ(Y,X)*DEVSQ(Y), SSE=(n−2)*STEYX(Y,X)^2

Step H — Python (single script you can run or share)

Run: python ch11_regression_report.py. Optional: --csv your.csv --xcol X --ycol Y --x0 10.

Write-up template (auto-filled)

Q&A — Getting SE(β₀), SE(β₁) with LINEST

Q: How do I get the standard error of the slope or intercept individually in Excel?

A: Use LINEST(known_y, known_x, const, stats) with the last two arguments set to TRUE so Excel returns the full 5×2 stats array, then pick out the piece you want with INDEX.

  • SE(β₁) (slope):
    =INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 2, 1)
  • SE(β₀) (intercept):
    =INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 2, 2)
  • t(β₁):
    =INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 1, 1) / INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 2, 1)
  • p(two-tailed) for slope:
    =T.DIST.2T(ABS( <t_stat_cell> ), COUNT(A2:A24)-2)
  • F statistic:
    =INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 4, 1)
  • df (error):
    =INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 4, 2)
  • p(F):
    =F.DIST.RT(INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 4, 1), 1, INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 4, 2))
What do the arguments mean?
  • known_y: your Y range (e.g., B2:B24)
  • known_x: your X range (e.g., A2:A24)
  • const=TRUE: fit an intercept (don’t force β₀ = 0)
  • stats=TRUE: return full 5×2 statistics array
How is the 5×2 array arranged?
Row Col 1 Col 2
1slope β₁intercept β₀
2SE(β₁)SE(β₀)
3Standard error of y|x (≈ STEYX)
4Fdf
5SS_reg (SSR)SS_resid (SSE)

Tip: In modern Excel you can avoid repeating LINEST using LET: =LET(L, LINEST(B2:B24, A2:A24, TRUE, TRUE), INDEX(L, 2, 1)) for SE(β₁).

Why this works (one sentence)

Excel’s least-squares line minimizes total squared vertical errors; if noise is roughly Normal with equal variance, it’s also the maximum likelihood line.

© ENGR200 • Single-file, offline. No external libs.