Chapter 11 – ICE — Excel-First Regression (No Math)

Prefer the one-click Excel summary? 👉 Use the ToolPak version.

Do the whole simple linear regression in Excel: compute the line, check fit, make a forecast. App also shows t-stats, p-values, and ANOVA F + a live guide for judging model quality. (Derivations are on the Math ICE.)

Intro — What problem are we solving?

Regression finds a straight-line rule y = b0 + b1·x to predict Y from X. Example: “If a student studies more hours, how much might their exam score increase?” This page stays Excel-first: functions, a trendline, and a forecast at your chosen x₀. The Math ICE shows why OLS = MLE under Normal noise.

Step A — Data (X, Y)

Type or paste below, or click “Generate data.” Columns: X, Y.

#	X	Y

Tip: After copying to Excel, put X in A2:AN and Y in B2:BN. Paste by single-clicking A1 → Ctrl+V. If needed: Data → Text to Columns → Delimited → Tab.

Step B — Excel functions you’ll use

Copy with the equals sign so results calculate immediately.

Core formulas (copy me ⤵)

Slope (b1):

Intercept (b0):

R²:

Correlation (r):

Std error of y|x (STEYX):

Predicted y at x = :

Residuals (if b0 in E2, b1 in E3):

Full stats via LINEST (enter as dynamic array in a 5×2 block):

Step C — Visual check (should match Excel)

✅ Data ready ✅ SLOPE & INTERCEPT ✅ RSQ & CORREL ✅ Forecast at x₀ ✅ Residuals

Step D — How to do this in Excel (student checklist)

Paste data: X in A2:AN, Y in B2:BN.
Compute =SLOPE, =INTERCEPT, =RSQ, =CORREL, =STEYX, =FORECAST.LINEAR.
Insert → Scatter; add Trendline and display equation & R².
Add residuals: =B2 - ($E$2 + $E$3*A2). Check curve/outliers.
Interpret: slope units, intercept meaning, R² strength, typical error size, and whether x₀ is inside the X-range.

Judge fit: (i) R² high, (ii) STEYX small (Y-units), (iii) p(b₁) small, (iv) p(F) small, (v) residuals look random (no curve/funnel/outliers).

Step E — Plain-English explanation of your results

Step F — Significance & ANOVA

t & p for slope b₁

t = b₁ / SE(b₁), df = n−2

t(b₁): –

p(two-tailed): –

Excel (copy-paste)

t(b₁):

df:

p(two-tailed):

t & p for intercept b₀

t = b₀ / SE(b₀)

t(b₀): –

p(two-tailed): –

Excel (copy-paste)

t(b₀):

p(two-tailed):

ANOVA — overall model

F = MSR / MSE; df₁=1, df₂=n−2

F: –

p(F): –

Excel (copy-paste)

p(F):

Source	SS	df	MS	F	p
Regression	–	1	–	–	–
Error	–	–	–	–	–
Total	–	–	–	–	–

How to read this table: SST is total variation in Y (about its mean). SSR is the part explained by the line. SSE is the leftover error. Divide SS by df to get MS. The F test compares explained vs leftover variation (MSR/MSE). A small p(F) means the line captures real signal (reject H₀: β₁=0). In simple regression, F = t(b₁)².

Step G — How to tell if the model is “good” (live with your numbers)

Hypotheses & Excel formulas (students can copy)

Slope test (t) — H₀: β₁=0, H₁: β₁≠0 (two-tailed).
t = b1 / SE(b1), df = n−2, p = T.DIST.2T(ABS(t), df)

With LINEST:
t(b1):
df:
p(two-tailed):

Overall model (F) — H₀: “no linear signal” (β₁=0).
F = MSR/MSE, p(F)=F.DIST.RT(F,1,n−2)
F:
p(F):

SS pieces in Excel: SST=DEVSQ(Y), SSR=RSQ(Y,X)*DEVSQ(Y), SSE=(n−2)*STEYX(Y,X)^2

Step H — Python (single script you can run or share)

Run: python ch11_regression_report.py. Optional: --csv your.csv --xcol X --ycol Y --x0 10.

Write-up template (auto-filled)

Q&A — Getting SE(β₀), SE(β₁) with LINEST

Q: How do I get the standard error of the slope or intercept individually in Excel?

A: Use LINEST(known_y, known_x, const, stats) with the last two arguments set to TRUE so Excel returns the full 5×2 stats array, then pick out the piece you want with INDEX.

SE(β₁) (slope):
=INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 2, 1)
SE(β₀) (intercept):
=INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 2, 2)
t(β₁):
=INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 1, 1) / INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 2, 1)
p(two-tailed) for slope:
=T.DIST.2T(ABS( <t_stat_cell> ), COUNT(A2:A24)-2)
F statistic:
=INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 4, 1)
df (error):
=INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 4, 2)
p(F):
=F.DIST.RT(INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 4, 1), 1, INDEX(LINEST(B2:B24, A2:A24, TRUE, TRUE), 4, 2))

What do the arguments mean?

known_y: your Y range (e.g., B2:B24)
known_x: your X range (e.g., A2:A24)
const=TRUE: fit an intercept (don’t force β₀ = 0)
stats=TRUE: return full 5×2 statistics array

How is the 5×2 array arranged?

Row	Col 1	Col 2
1	slope β₁	intercept β₀
2	SE(β₁)	SE(β₀)
3	R²	Standard error of y\|x (≈ STEYX)
4	F	df
5	SS_reg (SSR)	SS_resid (SSE)

Tip: In modern Excel you can avoid repeating LINEST using LET: =LET(L, LINEST(B2:B24, A2:A24, TRUE, TRUE), INDEX(L, 2, 1)) for SE(β₁).

Why this works (one sentence)

Excel’s least-squares line minimizes total squared vertical errors; if noise is roughly Normal with equal variance, it’s also the maximum likelihood line.