📦 Session 6.4 — Boxplot (with fences & min/max)

What it shows • Why • When

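What it shows: the five-number summary (min, Q1, median, Q3, max) drawn as a box with whiskers, with points beyond the 1.5×IQR fences marked as individual outliers.

Why/When: Boxplots are great for comparing groups, summarizing big lists with a 5-number summary, and spotting skew/outliers. They are built on the median and quartiles, so extreme values barely move them, and they don’t assume symmetry.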

Legend (consistent colors)

Boxplot of Alloy Compressive Strength (blue dashed = fences, purple dotted = min/max)

Shape examples (toy data)

Left-skewed: long tail to smaller values; median closer to Q3.
Symmetric: whiskers similar; median near box center.
Right-skewed: long tail to larger values; median closer to Q1.

Mini example — do it by hand

Data: 4, 5, 7, 8, 9, 10, 16, 20 (the same eight values used in the Excel steps below)
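A minimal sketch of the by-hand steps in Python, assuming the example uses the eight values from the Excel walkthrough (A2:A9) and the inclusive quartile rule, i.e. linear interpolation at position p·(n − 1), which is what QUARTILE.INC does. The helper name `inclusive_quantile` is ours for illustration, not a library function.

```python
def inclusive_quantile(sorted_values, p):
    """Quantile by linear interpolation at 0-based position p * (n - 1).

    This mirrors the QUARTILE.INC rule; other tools may use a different
    rule and report slightly different quartiles.
    """
    position = p * (len(sorted_values) - 1)
    lower_index = int(position)          # whole part: the value to start from
    fraction = position - lower_index    # fractional part: how far toward the next value
    if fraction == 0:
        return float(sorted_values[lower_index])
    step = sorted_values[lower_index + 1] - sorted_values[lower_index]
    return sorted_values[lower_index] + fraction * step


data = sorted([4, 5, 7, 8, 9, 10, 16, 20])   # assumed mini-example data

five_number_summary = {
    "min": min(data),
    "Q1": inclusive_quantile(data, 0.25),
    "median": inclusive_quantile(data, 0.50),
    "Q3": inclusive_quantile(data, 0.75),
    "max": max(data),
}
iqr = five_number_summary["Q3"] - five_number_summary["Q1"]

for name, value in five_number_summary.items():
    print(f"{name:>6}: {value}")
print(f"   IQR: {iqr}")
```

For these values the summary is min = 4, Q1 = 6.5, median = 8.5, Q3 = 11.5, max = 20, so IQR = 5. For example, Q1 sits at position 0.25 × 7 = 1.75, which is three quarters of the way from 5 to 7.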

Excel (2016+): Insert → Box & Whisker

  1. Put the values in a single column: e.g., A2:A9 = 4,5,7,8,9,10,16,20.
  2. Select just that range.
  3. Insert → Statistic Chart → Box & Whisker.
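  4. Right‑click a box → Format Data Series. Under Quartile Calculation, select Inclusive median, then confirm:
    • Excel flags outliers with the Tukey 1.5×IQR rule. With the inclusive median, Q1 = 6.5 and Q3 = 11.5, so the upper fence is 11.5 + 1.5×5 = 19 and the point 20 appears as an outlier.
    • With the Exclusive median option (QUARTILE.EXC, often the default), Q1 = 5.5 and Q3 = 14.5, the upper fence is 28, and 20 becomes the whisker end instead of an outlier.
    • Quartiles match our list when Excel uses the inclusive median (QUARTILE.INC).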
  5. Add chart elements if you want: Axis Title, Data Labels, etc.
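Whether 20 is flagged depends on the quartile method, so it is worth checking the numbers outside Excel. The sketch below uses Python's standard `statistics.quantiles` with its `"inclusive"` and `"exclusive"` methods. For this dataset they give the same quartiles as the QUARTILE.INC and QUARTILE.EXC formulas; treating them as stand-ins for Excel's two chart options is an assumption. The helper name `upper_fence_and_outliers` is ours.

```python
from statistics import quantiles

data = [4, 5, 7, 8, 9, 10, 16, 20]   # the values from step 1


def upper_fence_and_outliers(values, method):
    """Return (Q1, Q3, upper fence, high outliers) for one quartile method."""
    q1, _median, q3 = quantiles(values, n=4, method=method)
    upper_fence = q3 + 1.5 * (q3 - q1)
    high_outliers = [v for v in values if v > upper_fence]
    return q1, q3, upper_fence, high_outliers


results = {m: upper_fence_and_outliers(data, m) for m in ("inclusive", "exclusive")}
for method, (q1, q3, fence, outliers) in results.items():
    print(f"{method:>9}: Q1={q1}, Q3={q3}, upper fence={fence}, outliers={outliers}")
```

The inclusive method gives Q1 = 6.5, Q3 = 11.5 and an upper fence of 19, so 20 is an outlier. The exclusive method gives Q1 = 5.5, Q3 = 14.5 and an upper fence of 28, so nothing is flagged. If your chart shows no outlier dot, the quartile setting is the first thing to check.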

Tip: If you don’t see “Box & Whisker”, you can create a quick one in Google Sheets (Insert → Chart → Chart type: Box and whisker), or compute the five numbers with QUARTILE.INC and draw bars manually.

🙋 Student Q&A

Why use a boxplot (not a histogram)? When?

Boxplots compress a dataset to 5 numbers plus outliers—perfect for side-by-side comparisons of groups and quick checks of skew/outliers. Use histograms when you want the full shape within bins and have many data points.

How do I read skewness from a boxplot?

Median near Q1 + long upper whisker ⇒ right-skewed. Median near Q3 + long lower whisker ⇒ left-skewed. Balanced ⇒ symmetric.
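The rule of thumb above can be sketched in code. This is a rough heuristic, not a formal skewness test: it looks only at where the median sits inside the box and leaves the whisker comparison out to stay short. The function name `describe_skew` and the `tolerance` cutoff are hypothetical choices for illustration, and the three lists are toy data.

```python
from statistics import quantiles


def describe_skew(values, tolerance=0.1):
    """Rough skew label from where the median sits inside the box.

    `tolerance` is a hypothetical cutoff: the two box halves count as
    balanced when they differ by less than this fraction of the IQR.
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        return "undetermined (box has zero length)"
    lower_half = median - q1    # box length below the median
    upper_half = q3 - median    # box length above the median
    if abs(upper_half - lower_half) < tolerance * iqr:
        return "roughly symmetric"
    return "right-skewed" if upper_half > lower_half else "left-skewed"


print(describe_skew([1, 2, 2, 3, 3, 4, 9, 15]))       # long tail to larger values
print(describe_skew([1, 7, 13, 14, 14, 15, 15, 16]))  # long tail to smaller values
print(describe_skew([1, 2, 3, 4, 5, 6, 7, 8, 9]))     # evenly spaced
```

The three calls print right-skewed, left-skewed, and roughly symmetric, in that order.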

Are whiskers the same as min and max?

No. Whiskers stop at the most extreme non-outlier data points: lower whisker = smallest value ≥ lower fence; upper whisker = largest value ≤ upper fence. If there are no outliers, whiskers do equal min/max.

Are fences the same as whiskers?

No. Blue dashed fences are cutoffs computed from quartiles: Lower fence = Q1 − 1.5·IQR; Upper fence = Q3 + 1.5·IQR. Whiskers are drawn at the most extreme data points still inside those cutoffs.
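To see fences, whiskers, and min/max as three different things, here is a small sketch using the sample values from the Excel example. It assumes the inclusive quartile method (the `"inclusive"` option of Python's `statistics.quantiles`); a different method would move the fences.

```python
from statistics import quantiles

data = [4, 5, 7, 8, 9, 10, 16, 20]   # sample values from the Excel example

q1, _median, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Fences are computed cutoffs; they need not be actual data values.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme data points that are still inside the fences.
inside = [v for v in data if lower_fence <= v <= upper_fence]
lower_whisker, upper_whisker = min(inside), max(inside)

# Anything outside the fences is drawn as an individual outlier point.
outliers = [v for v in data if v < lower_fence or v > upper_fence]

print(f"fences:   {lower_fence} to {upper_fence}")
print(f"whiskers: {lower_whisker} to {upper_whisker}")
print(f"min/max:  {min(data)} to {max(data)}")
print(f"outliers: {outliers}")
```

Here the fences are −1 and 19, the whiskers run from 4 to 16, and the min/max are 4 and 20. The lower whisker equals the min because nothing falls below the lower fence; the upper whisker stops at 16 because 20 lies beyond the upper fence.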

What are the purple dotted lines?

They mark the dataset’s Min and Max. If a min/max is beyond a fence, you’ll see a red outlier dot at that value.

Which center is best if skewed?

The median. The mean gets pulled by the tail/outliers.
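A quick check of that claim, using the sample values from the Excel example plus a hypothetical copy in which the largest value is pushed far out (200 instead of 20):

```python
from statistics import mean, median

values = [4, 5, 7, 8, 9, 10, 16, 20]
stretched = values[:-1] + [200]   # hypothetical: same data, largest value pushed far out

for label, sample in (("original", values), ("stretched", stretched)):
    print(f"{label:>9}: mean={mean(sample)}, median={median(sample)}")
```

The mean jumps from 9.875 to 32.375, while the median stays at 8.5 in both cases.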

What does box length mean?

It’s the IQR (Q3 − Q1): the spread of the middle 50% of the data. Bigger box ⇒ more variability in that middle half.

If I remove outliers, what happens?

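Whiskers usually move to the new min/max (now inside the fences), and Q1/Q3 might shift a bit. Because the fences are recomputed from the new quartiles, a point that used to sit inside them can occasionally become a new outlier.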
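A small experiment with the sample values from the Excel example shows why it pays to recompute after trimming rather than guess. The helper name `tukey_summary` is ours, and the inclusive quartile method is an assumption; other methods would give different fences.

```python
from statistics import quantiles


def tukey_summary(values):
    """Quartiles (inclusive method), whisker ends, and outliers under the 1.5×IQR rule."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"Q1": q1, "Q3": q3, "whiskers": (min(inside), max(inside)), "outliers": outliers}


data = [4, 5, 7, 8, 9, 10, 16, 20]
before = tukey_summary(data)

# Drop the flagged points, then recompute everything from scratch.
trimmed = [v for v in data if v not in before["outliers"]]
after = tukey_summary(trimmed)

print("before:", before)
print("after: ", after)
```

Before trimming, Q1 = 6.5, Q3 = 11.5, the whiskers run 4 to 16, and 20 is the only outlier. After removing 20, the quartiles shift to Q1 = 6.0 and Q3 = 9.5, the upper fence tightens to 14.75, and 16 is now flagged, so the upper whisker lands at 10 rather than at the new max. This is an exception to the usual pattern, which is exactly why the recomputation matters.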

Why draw fences? Many apps don’t show them.

Most tools use them but hide the lines. We draw the blue dashed fences for teaching so you can see why a point is an outlier and where whiskers stop.

🙋 Student Q&A (short)

Q: Why use a boxplot, not a histogram?
A: Boxplots summarize a dataset with just five numbers plus outliers, and are great for comparing groups side‑by‑side.

Q: Which center is best if skewed?
A: The median. The mean is dragged by the tail/outliers.

Q: What does a long right whisker mean?
A: Right‑skewed: more spread/tail toward large values.

Q: Are outliers “bad data”?
A: Not always. They might be real extremes or data entry errors—investigate!