📈 Session 6.7 — Normal Probability Plot

What is p (plotting position)?

\(p_i\) is the percentile you assign to the i-th smallest value \(x_{(i)}\). It estimates the CDF at that point: \(\;p_i \approx F\!\big(x_{(i)}\big)\). Then we compute the theoretical normal score \(\;z_i = \Phi^{-1}(p_i)\;\) (same as Excel’s NORM.S.INV(\(p_i\))).

Not a p-value! Here \(p\) is a percentile/position, not a hypothesis-test p-value.

Common formulas for \(p_i\) (all acceptable — be consistent)
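
For rank \(i = 1, \dots, n\) in the sorted sample, the three formulas that appear in the Excel steps on this page are:

\[
\text{Hazen: } p_i = \frac{i - 0.5}{n}, \qquad
\text{Blom: } p_i = \frac{i - 0.375}{n + 0.25}, \qquad
\text{Weibull: } p_i = \frac{i}{n + 1}.
\]

Quick check with \(n = 5\), \(i = 1\): Hazen gives \(0.5/5 = 0.100\), Blom gives \(0.625/5.25 \approx 0.119\), Weibull gives \(1/6 \approx 0.167\). The formulas differ most at the ends of the sample and nearly agree in the middle.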

This app lets you pick the formula; the CSV includes \(x\), \(p\), and \(z\) so students can check the math.
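
One way to check the \(x\), \(p\), \(z\) columns by hand is the short Python sketch below. It is not the app's own code: the function names and the sample grades are made up for illustration, and `statistics.NormalDist().inv_cdf` from the standard library stands in for Excel's NORM.S.INV.

```python
from statistics import NormalDist

def plotting_positions(n, method="hazen"):
    """Return p_1..p_n for ranks 1..n using the named formula."""
    offsets = {
        "hazen": (0.5, 0.0),      # (i - 0.5) / n
        "blom": (0.375, 0.25),    # (i - 0.375) / (n + 0.25)
        "weibull": (0.0, 1.0),    # i / (n + 1)
    }
    a, b = offsets[method]
    return [(i - a) / (n + b) for i in range(1, n + 1)]

def normal_scores(data, method="hazen"):
    """Return rows (x, p, z): sorted value, plotting position, z = Phi^{-1}(p)."""
    xs = sorted(data)
    ps = plotting_positions(len(xs), method)
    inv = NormalDist().inv_cdf    # standard normal quantile function
    return [(x, p, inv(p)) for x, p in zip(xs, ps)]

grades = [88, 72, 95, 60, 81]     # made-up sample
for x, p, z in normal_scores(grades):
    print(f"{x:5.1f}  p={p:.3f}  z={z:+.3f}")
```

Switching `method` to `"blom"` or `"weibull"` changes \(p\) and \(z\) slightly, mostly for the smallest and largest values.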

1) Paste data

2) Probability plot (z vs x)

Orientation: X = sorted data x(j), Y = z. Read the ends (outer ~3 points) relative to the line: both below → right-skew; both above → left-skew; left above & right below → heavy-tailed; left below & right above → light-tailed. The line here is a trimmed fit (fitted to the middle 60% of the points only) so extremes don’t hide tail behavior.
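
The reading rule above can be sketched in Python as follows. This is an illustration under stated assumptions, not the app's implementation: the trimming rule (drop the lowest and highest 20% of points, then ordinary least squares), the Hazen positions, the tolerance `tol`, and all function names are choices made here.

```python
from math import log
from statistics import NormalDist

def trimmed_line(xs, zs, trim=0.2):
    """Least-squares line z = a + b*x fitted to the middle points only."""
    n = len(xs)
    lo = int(n * trim)                       # points dropped at each end
    mx, mz = xs[lo:n - lo], zs[lo:n - lo]    # middle ~60% when trim = 0.2
    xbar, zbar = sum(mx) / len(mx), sum(mz) / len(mz)
    sxx = sum((x - xbar) ** 2 for x in mx)
    sxz = sum((x - xbar) * (z - zbar) for x, z in zip(mx, mz))
    b = sxz / sxx
    return zbar - b * xbar, b

def read_ends(data, k=3, tol=0.05):
    """Classify the shape from the outer k points on each side (X = x, Y = z)."""
    xs = sorted(data)
    n = len(xs)
    inv = NormalDist().inv_cdf
    zs = [inv((j - 0.5) / n) for j in range(1, n + 1)]   # Hazen positions
    a, b = trimmed_line(xs, zs)
    resid = [z - (a + b * x) for x, z in zip(xs, zs)]    # > 0 means above the line
    left, right = sum(resid[:k]) / k, sum(resid[-k:]) / k

    def sign(v):
        return 1 if v > tol else -1 if v < -tol else 0

    rules = {
        (-1, -1): "right-skew",
        (1, 1): "left-skew",
        (1, -1): "heavy-tailed",
        (-1, 1): "light-tailed",
        (0, 0): "roughly normal",
    }
    return rules.get((sign(left), sign(right)), "no clear pattern")

# Exponential quantiles: an idealized right-skewed sample of size 30.
skewed = [-log(1 - (j - 0.5) / 30) for j in range(1, 31)]
print(read_ends(skewed))
```

The tolerance is in z units and is arbitrary; with real, noisy data a visual check of the plot remains the better guide.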

3) Excel — quick build (X = grade, Y = z)

A) 1-Minute version (Excel 365 with spill)

  1. Put grades in column A, starting at A2 (one per cell).
  2. B2 (sorted grades, spill): =SORT(FILTER(A2:A1000, A2:A1000<>"")) (raise 1000 if you have more rows)
  3. C2 (ranks j, spill): =SEQUENCE(COUNTA(B2#))
  4. D2 (plotting position pj, spill; Hazen): =(C2#-0.5)/COUNTA(B2#)
    Blom: =(C2#-0.375)/(COUNTA(B2#)+0.25) • Weibull: =C2#/(COUNTA(B2#)+1)
  5. E2 (theoretical z, spill): =NORM.S.INV(D2#)
  6. Make the chart → Scatter (Markers): X=B2#, Y=E2#
  7. Reference line (two points): G2: =MIN(B2#), G3: =MAX(B2#); H2: =(G2-AVERAGE(B2#))/STDEV.S(B2#); H3: =(G3-AVERAGE(B2#))/STDEV.S(B2#). Add these as a second scatter series (X=G2:G3, Y=H2:H3) drawn as a line without markers.
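
To double-check the two reference-line points outside Excel, a minimal Python sketch is shown below. The function name and sample grades are made up for illustration; `statistics.stdev` is the sample standard deviation, the same quantity STDEV.S computes. Note that this line uses the mean and SD of all the data, so it is not the trimmed fit drawn by the app.

```python
from statistics import mean, stdev

def reference_line(data):
    """Two (x, z) points on the line z = (x - mean) / sd, at the min and max."""
    m, s = mean(data), stdev(data)   # stdev = sample SD (n - 1 denominator)
    lo, hi = min(data), max(data)
    return (lo, (lo - m) / s), (hi, (hi - m) / s)

grades = [60, 70, 80, 90, 100]       # made-up sample
print(reference_line(grades))
```

For these grades the mean is 80 and the sample SD is about 15.81, so the endpoints sit at roughly z = -1.26 and z = +1.26.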

B) Classic Excel (no spill)

  1. Sort A2:A? ascending.
  2. B1 n: =COUNT(A2:A1048576)
  3. B2 j (fill to n+1): =ROW()-1
  4. C2 pj (Hazen): =(B2-0.5)/$B$1 (fill)
  5. D2 z (fill): =NORM.S.INV(C2)
  6. Chart: X=A2:A?, Y=D2:D?. Line: use min/max + mean/sd as above.

C) Read the plot

  • Points hug the line → roughly normal.
  • Both ends below the line → right-skew; both ends above → left-skew.
  • Ends below→above → light tails; above→below → heavy tails.

Orientation here: X = sorted grade, Y = z.

Mini-gallery — Light vs Heavy vs Right-skewed vs Normal (with fitted lines)

Orientation: z vertical, x(j) horizontal. Light tails = ends below→above; Heavy tails = above→below; Right-skew = both below (Left-skew = both above). Lines are a trimmed fit (middle 60%) to avoid extremes pulling the fit.

(a) Light-tailed — ends below→above.

(b) Heavy-tailed — ends above→below.

(c) Right-skewed — both ends below.

(d) Normal — roughly linear.

4) Q & A

Why a probability plot?

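It’s a normality check that works with small samples. A histogram’s shape depends on the bin choice, which makes it hard to trust when n is small or moderate; the probability plot shows every data point and needs no bins.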

Which axis is which?

This app (and your book) put z on the vertical axis and x(j) on the horizontal. Some tools flip the axes: the straight-line check is the same, but the above/below reading rules swap.

What does the fitted line mean?

If the data are normal, points should align with a straight line. We fit the line to the middle 60% so tail patterns are visible.