📈 Session 6.7 — Normal Probability Plot
What is p (plotting position)?
\(p_i\) is the cumulative probability (a percentile written as a fraction between 0 and 1) that you assign to the i-th smallest value \(x_{(i)}\). It estimates the CDF at that point: \(\;p_i \approx F\!\big(x_{(i)}\big)\).
From it we compute the theoretical normal score \(\;z_i = \Phi^{-1}(p_i)\;\) (the same as Excel’s NORM.S.INV(\(p_i\))).
Not a p-value! Here \(p\) is a percentile/position, not a hypothesis-test p-value.
Common formulas for \(p_i\) (all acceptable — be consistent)
- Hazen (Textbook): \(\displaystyle p_i=\frac{i-0.5}{n}\)
- Blom / Rankit: \(\displaystyle p_i=\frac{i-0.375}{\,n+0.25\,}\)
- Weibull: \(\displaystyle p_i=\frac{i}{\,n+1\,}\)
- Benard (median-rank): \(\displaystyle p_i=\frac{i-0.3}{\,n+0.4\,}\)
This app lets you pick the formula; the CSV includes \(x\), \(p\), and \(z\) so students can check the math.
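To check the math outside the app, the four formulas can be written in the common form \(p_i = (i-a)/(n+b)\). The following is a minimal Python sketch under that assumption. The names `PLOTTING_CONSTANTS`, `plotting_positions`, and `normal_scores` are illustrative and not taken from the app, and the grades are made-up data.

```python
from statistics import NormalDist

# (a, b) constants for the general form p_i = (i - a) / (n + b),
# read off the four formulas listed above.
PLOTTING_CONSTANTS = {
    "hazen": (0.5, 0.0),
    "blom": (0.375, 0.25),
    "weibull": (0.0, 1.0),
    "benard": (0.3, 0.4),
}


def plotting_positions(n, method="hazen"):
    """Return [p_1, ..., p_n] for ranks i = 1..n."""
    a, b = PLOTTING_CONSTANTS[method]
    return [(i - a) / (n + b) for i in range(1, n + 1)]


def normal_scores(data, method="hazen"):
    """Return rows (x_(i), p_i, z_i) for the sorted data."""
    xs = sorted(data)
    ps = plotting_positions(len(xs), method)
    std_normal = NormalDist()  # mean 0, sd 1; inv_cdf plays the role of NORM.S.INV
    return [(x, p, std_normal.inv_cdf(p)) for x, p in zip(xs, ps)]


if __name__ == "__main__":
    grades = [72, 85, 64, 90, 78]  # made-up illustration data
    for x, p, z in normal_scores(grades):
        print(f"{x:5.1f}  p={p:.3f}  z={z:+.4f}")
```

With Hazen and n = 5 the positions are 0.1, 0.3, 0.5, 0.7, 0.9, and the corresponding z values are approximately -1.2816, -0.5244, 0, +0.5244, +1.2816.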
1) Paste data
2) Probability plot (z vs x)
Orientation: X = sorted data x(j), Y = z. Read the ends (outer ~3 points): both below → right-skew; both above → left-skew; left above & right below → heavy-tailed; left below & right above → light-tailed. The line here is a trimmed fit (middle 60%) so extremes don’t hide tail behavior.
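The reading rule above can be sketched in Python. This is not the app's code: it assumes the trimmed fit is an ordinary least-squares line of z on x over the points left after dropping 20% of the ranks at each end, and it judges each end by the mean residual of its outer 3 points. The threshold `tol` and the function names are hypothetical choices for illustration.

```python
import math
from statistics import NormalDist, mean


def hazen_scores(data):
    """Sorted data and Hazen normal scores z_i = Phi^{-1}((i - 0.5) / n)."""
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()
    zs = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return xs, zs


def trimmed_line(xs, zs, keep=0.60):
    """Least-squares line z = intercept + slope * x through the middle points.

    Assumption: 'middle 60%' means dropping 20% of the points at each end by rank.
    """
    n = len(xs)
    drop = round(n * (1 - keep) / 2)
    mid_x, mid_z = xs[drop:n - drop], zs[drop:n - drop]
    x_bar, z_bar = mean(mid_x), mean(mid_z)
    sxx = sum((x - x_bar) ** 2 for x in mid_x)
    slope = sum((x - x_bar) * (z - z_bar) for x, z in zip(mid_x, mid_z)) / sxx
    return z_bar - slope * x_bar, slope


def read_ends(data, k=3, tol=0.25):
    """Classify the tail pattern from the outer k points at each end.

    Orientation: X = sorted data, Y = z. `tol` is a hypothetical threshold
    (in z units) for calling an end 'above' or 'below' the line.
    """
    xs, zs = hazen_scores(data)
    intercept, slope = trimmed_line(xs, zs)
    resid = [z - (intercept + slope * x) for x, z in zip(xs, zs)]

    def side(r):
        return "above" if r > tol else "below" if r < -tol else "on"

    pattern = (side(mean(resid[:k])), side(mean(resid[-k:])))
    labels = {
        ("below", "below"): "right-skewed",
        ("above", "above"): "left-skewed",
        ("above", "below"): "heavy-tailed",
        ("below", "above"): "light-tailed",
    }
    return labels.get(pattern, "roughly normal or unclear")


if __name__ == "__main__":
    n = 30
    probs = [(i - 0.5) / n for i in range(1, n + 1)]
    skewed = [-math.log(1 - p) for p in probs]  # exponential quantiles (right-skewed)
    print(read_ends(skewed))
```

The demo feeds in exact exponential quantiles, an idealized right-skewed sample, so both ends land below the line. Real data are noisier, and a fixed threshold like `tol` is only a rough aid to reading the plot by eye.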
3) Excel — quick build (X = grade, Y = z)
A) 1-Minute version (Excel 365 with spill)
- Put grades in column A, starting at A2 (one per cell).
- B2 (sorted grades, spill): =SORT(FILTER(A2:A1048576, A2:A1048576<>""))
- C2 (ranks j, spill): =SEQUENCE(COUNTA(B2#))
- D2 (plotting position pj, spill; Hazen): =(C2#-0.5)/COUNTA(B2#)
  Alternatives: Blom =(C2#-0.375)/(COUNTA(B2#)+0.25); Weibull =C2#/(COUNTA(B2#)+1)
- E2 (theoretical z, spill): =NORM.S.INV(D2#)
- Make the chart → Scatter (Markers): X = B2#, Y = E2#.
- Reference line (two points): G2 =MIN(B2#), G3 =MAX(B2#); H2 =(G2-AVERAGE(B2#))/STDEV.S(B2#); H3 =(G3-AVERAGE(B2#))/STDEV.S(B2#).
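The two reference points are the endpoints of the line you get by standardizing x with the sample mean and standard deviation. Written out, with \(\bar{x}\) for AVERAGE, \(s\) for STDEV.S, and \(x_{\min}\), \(x_{\max}\) for MIN and MAX (notation introduced here for illustration):

\[
z = \frac{x-\bar{x}}{s}, \qquad
\text{endpoints: } \left(x_{\min},\ \frac{x_{\min}-\bar{x}}{s}\right)
\ \text{and}\ \left(x_{\max},\ \frac{x_{\max}-\bar{x}}{s}\right).
\]

This Excel line uses all the data, so it can differ slightly from the app's trimmed fit, which uses only the middle 60% of the points.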
B) Classic Excel (no spill)
- Sort A2:A? ascending.
- B1 (n): =COUNT(A2:A1048576)
- B2 (rank j; fill down to row n+1): =ROW()-1
- C2 (pj, Hazen; fill down): =(B2-0.5)/$B$1
- D2 (z; fill down): =NORM.S.INV(C2)
- Chart: X = A2:A?, Y = D2:D?. Reference line: use min/max with mean/sd, as in version A.
C) Read the plot
- Points hug the line → roughly normal.
- Both ends fall below the line → right-skew; both ends above → left-skew.
- Left end below, right end above → light tails; left end above, right end below → heavy tails.
Orientation here: X = sorted grade, Y = z.
Mini-gallery — Light vs Heavy vs Right-skewed vs Normal (with fitted lines)
Orientation: z vertical, x(j) horizontal. Light tails = ends below→above; Heavy tails = above→below; Right-skew = both below (Left-skew = both above). Lines are a trimmed fit (middle 60%) to avoid extremes pulling the fit.
(a) Light-tailed — ends below→above.
(b) Heavy-tailed — ends above→below.
(c) Right-skewed — both ends below.
(d) Normal — roughly linear.
4) Q & A
Why a probability plot?
It is a normality check that works with small samples. A histogram's shape depends on the choice of bins, so when n is small or medium the probability plot is usually the more reliable of the two.
Which axis is which?
This app (and your book) use z on the vertical, x(j) on the horizontal. Some tools flip the axes; the conclusions are the same, but "above" and "below" the line swap, so check the orientation before reading the ends.
What does the fitted line mean?
If the data are normal, points should align with a straight line. We fit the line to the middle 60% so tail patterns are visible.