📈 Session 6.7 — Normal Probability Plot

What is p (plotting position)?

$p_i$ is the percentile you assign to the i-th smallest value $x_{(i)}$. It estimates the CDF at that point: $\;p_i \approx F\!\big(x_{(i)}\big)$. Then we compute the theoretical normal score $\;z_i = \Phi^{-1}(p_i)\;$ (same as Excel’s NORM.S.INV($p_i$)).

Not a p-value! Here $p$ is a percentile/position, not a hypothesis-test p-value.

Common formulas for $p_i$ (all acceptable — be consistent)

Hazen (Textbook): $\displaystyle p_i=\frac{i-0.5}{n}$
Blom / Rankit: $\displaystyle p_i=\frac{i-0.375}{\,n+0.25\,}$
Weibull: $\displaystyle p_i=\frac{i}{\,n+1\,}$
Benard (median-rank): $\displaystyle p_i=\frac{i-0.3}{\,n+0.4\,}$

This app lets you pick the formula; the CSV includes $x$, $p$, and $z$ so students can check the math.
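To check the math by hand, the four formulas above can be written as one rule, $p_i = (i-a)/(n+b)$, with a different pair $(a, b)$ for each. The sketch below is only an illustration, not the app's own code: the names `FORMULAS` and `plotting_table` and the sample grades are made up, and `NormalDist().inv_cdf` from Python's standard library plays the role of NORM.S.INV.

```python
from statistics import NormalDist

# (a, b) constants for p_i = (i - a) / (n + b), one pair per formula listed above.
# The dictionary name and keys are illustrative, not part of the app.
FORMULAS = {
    "hazen": (0.5, 0.0),
    "blom": (0.375, 0.25),
    "weibull": (0.0, 1.0),
    "benard": (0.3, 0.4),
}

def plotting_table(data, method="hazen"):
    """Return rows (x_(i), p_i, z_i) for the sorted data."""
    a, b = FORMULAS[method]
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    rows = []
    for i, x in enumerate(xs, start=1):
        p = (i - a) / (n + b)          # plotting position of the i-th smallest value
        z = std_normal.inv_cdf(p)      # theoretical normal score, like NORM.S.INV(p)
        rows.append((x, p, z))
    return rows

# Made-up grades, just to print a small x / p / z table.
for x, p, z in plotting_table([72, 85, 91, 64, 78]):
    print(f"{x:>5} {p:.3f} {z:+.3f}")
```

Switching `method` changes only the $p$ column (and therefore $z$); the sorted $x$ values stay the same.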

1) Paste data

2) Probability plot (z vs x)

Orientation: X = sorted data x(j), Y = z. Read the ends (the outer ~3 points on each side) against the line: both ends below the line → right-skew; both above → left-skew; left end above & right end below → heavy-tailed; left end below & right end above → light-tailed. The line here is a trimmed fit (middle 60% of the points), so extremes don’t hide tail behavior.
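One way a trimmed fit could work is ordinary least squares applied only to the middle 60% of the sorted points. The sketch below assumes exactly that; the app's actual fitting method is not specified here, and the function name `trimmed_line`, the `keep` parameter, and the demo values are all hypothetical.

```python
def trimmed_line(xs_sorted, zs, keep=0.60):
    """Least-squares line z = intercept + slope * x using only the middle
    `keep` share of the points (an assumed reading of "trimmed fit")."""
    n = len(xs_sorted)
    drop = int(round(n * (1 - keep) / 2))   # points ignored at each end
    xs = xs_sorted[drop:n - drop]
    ys = zs[drop:n - drop]
    m = len(xs)
    if m < 2:
        raise ValueError("need at least two points in the middle section")
    x_bar = sum(xs) / m
    y_bar = sum(ys) / m
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Made-up demo: points on the line z = 0.5x - 1, except two wild end points.
xs = list(range(10))
zs = [0.5 * x - 1 for x in xs]
zs[0], zs[-1] = 5.0, -5.0
slope, intercept = trimmed_line(xs, zs)
print(slope, intercept)  # the two end points do not affect the fit
```

Because the outer points are left out of the fit, they can sit well away from the line, which is what makes the tail patterns readable.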

3) Excel — quick build (X = grade, Y = z)

A) 1-Minute version (Excel 365 with spill)

Put grades in column A, starting at A2 (one per cell).
B2 (sorted grades, spill): =SORT(FILTER(A2:A1048576, A2:A1048576<>""))
C2 (ranks j, spill): =SEQUENCE(COUNTA(B2#))
D2 (plotting position p_j, spill; Hazen): =(C2#-0.5)/COUNTA(B2#)
Blom: =(C2#-0.375)/(COUNTA(B2#)+0.25) • Weibull: =C2#/(COUNTA(B2#)+1)
E2 (theoretical z, spill): =NORM.S.INV(D2#)
Make the chart → Scatter (Markers): X=B2#, Y=E2#
Reference line (two points): G2: =MIN(B2#), G3: =MAX(B2#); H2: =(G2-AVERAGE(B2#))/STDEV.S(B2#); H3: =(G3-AVERAGE(B2#))/STDEV.S(B2#).
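For checking the reference-line cells outside Excel, the same two points can be computed in a few lines. This is a minimal sketch with a made-up function name (`reference_line_points`) and made-up grades; `statistics.stdev` is the sample standard deviation, matching STDEV.S.

```python
from statistics import mean, stdev

def reference_line_points(data):
    """Two (x, z) points on the line z = (x - mean) / sd,
    taken at the smallest and largest data values."""
    m = mean(data)
    s = stdev(data)                 # sample SD, the counterpart of STDEV.S
    lo, hi = min(data), max(data)   # counterparts of MIN(...) and MAX(...)
    return (lo, (lo - m) / s), (hi, (hi - m) / s)

# Made-up grades, for illustration only.
left, right = reference_line_points([72, 85, 91, 64, 78])
print(left, right)
```

Note that this line uses the mean and SD of all the data, so it is not the same line as a fit restricted to the middle of the points.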

B) Classic Excel (no spill)

Sort A2:A? ascending.
B1 n: =COUNT(A2:A1048576)
B2 j (fill down to row n+1): =ROW()-1
C2 p_j (Hazen): =(B2-0.5)/$B$1 (fill)
D2 z (fill): =NORM.S.INV(C2)
Chart: X=A2:A?, Y=D2:D?. Line: use min/max + mean/sd as above.

C) Read the plot

Points hug the line → roughly normal.
Both ends below the line → right-skew; both ends above → left-skew.
Left end below, right end above → light tails; left end above, right end below → heavy tails.
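The reading rules above can be turned into a small lookup, which may help when checking a plot against the rules. The sketch below is illustrative only: the function name `read_ends`, the choice of averaging the residuals of the `k` outer points, and the tolerance `tol` are all assumptions, and it expects a line (slope and intercept) that has already been fitted in the X = data, Y = z orientation.

```python
# Pattern of (left end, right end) relative to the line, X = data and Y = z.
PATTERNS = {
    ("below", "below"): "right-skew",
    ("above", "above"): "left-skew",
    ("above", "below"): "heavy-tailed",
    ("below", "above"): "light-tailed",
}

def read_ends(xs_sorted, zs, slope, intercept, k=3, tol=0.1):
    """Label the tail pattern from the mean residual of the k outer
    points at each end. k and tol are illustrative defaults."""
    def side(indices):
        resid = [zs[i] - (intercept + slope * xs_sorted[i]) for i in indices]
        avg = sum(resid) / len(resid)
        if avg > tol:
            return "above"
        if avg < -tol:
            return "below"
        return "on"

    n = len(xs_sorted)
    left = side(range(k))            # k smallest values
    right = side(range(n - k, n))    # k largest values
    return PATTERNS.get((left, right), "no clear tail pattern")

# Made-up points: on the line z = x, but with both ends pushed below it.
xs = list(range(10))
zs = [float(x) for x in xs]
for i in (0, 1, 2, 7, 8, 9):
    zs[i] -= 1.0
print(read_ends(xs, zs, slope=1.0, intercept=0.0))
```

A fixed tolerance is a crude stand-in for judgment; with small samples, a point or two off the line is common even for normal data.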

Orientation here: X = sorted grade, Y = z.

Mini-gallery — Light vs Heavy vs Right-skewed vs Normal (with fitted lines)

Orientation: z vertical, x(j) horizontal. Light tails = ends below→above; Heavy tails = above→below; Right-skew = both below (Left-skew = both above). Lines are a trimmed fit (middle 60%) to avoid extremes pulling the fit.

(a) Light-tailed — ends below→above.

(b) Heavy-tailed — ends above→below.

(c) Right-skewed — both ends below.

(d) Normal — roughly linear.

4) Q & A

Why a probability plot?

It’s a normality check that works with small samples. Unlike a histogram, it does not depend on a choice of bins, so it is more reliable when n is small or moderate.

Which axis is which?

This app (and your book) use z on the vertical, x(j) on the horizontal. Some tools flip the axes: a straight line still means roughly normal, but the above/below rules for the ends reverse.

What does the fitted line mean?

If the data are normal, points should align with a straight line. We fit the line to the middle 60% so tail patterns are visible.