Session 8.3 — Confidence Interval for Variance (χ²)
Also reports the CI for the standard deviation \( \sigma \). Assumes approximately Normal data.
Why a CI for variance?
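Sometimes the key question is the variability, not just the mean: for example, process spread in QC, calibration uncertainty, or risk buffers. Under Normality, the sample variance \( S^2 \) links to a chi-square distribution with \( \nu = n-1 \) degrees of freedom:
\[ \frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1} \]
Inverting this statement gives the two-sided \(100(1-\alpha)\%\) interval, written with left-tail quantiles \( \chi^2_{p,\nu} \):
\[ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\nu}} \;\le\; \sigma^2 \;\le\; \frac{(n-1)s^2}{\chi^2_{\alpha/2,\nu}} \]
Taking square roots of both bounds gives the interval for \( \sigma \).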
Interpretation: if we repeat the whole process many times, about \(100(1-\alpha)\%\) of those intervals would cover the true variance (or SD).
Worked example (raw data or summary)
Enter raw data to compute \( s \) and \( n \), or use the summary boxes directly. Choose confidence and compute.
Excel — quick steps (two-sided CI)
- Inputs: \( s \) in B2, \( n \) in B3, \( \alpha \) in B4 (e.g., 0.05 for 95%).
- df \( \nu \) in B5: =B3-1
- Left-tail χ² quantiles (preferred):
        - \( \chi^2_{\alpha/2,\nu} \): =CHISQ.INV(B4/2, B5)
        - \( \chi^2_{1-\alpha/2,\nu} \): =CHISQ.INV(1-B4/2, B5)
- Variance bounds:
        - Lower \( \sigma^2_L \): =(B5*B2^2)/CHISQ.INV(1-B4/2,B5)
        - Upper \( \sigma^2_U \): =(B5*B2^2)/CHISQ.INV(B4/2,B5)
- SD bounds: =SQRT(σ²_L) and =SQRT(σ²_U), i.e., the square roots of the two variance bounds.
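The spreadsheet steps above can also be sketched in Python. This is a minimal illustration using only the standard library, which has no chi-square quantile function; the helpers `chi2_cdf`, `chi2_ppf`, and `variance_ci` are hypothetical names written for this sketch (a series for the incomplete gamma function plus bisection), not part of any library. It is meant for the moderate df and confidence levels used in this session.

```python
import math

def chi2_cdf(x, df):
    """Left-tail chi-square CDF via the power series of the
    regularized lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, h = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:      # all terms are positive
        term *= h / (a + k)
        total += term
        k += 1
    return total * math.exp(-h + a * math.log(h) - math.lgamma(a))

def chi2_ppf(p, df):
    """Left-tail quantile by bisection (plays the role of CHISQ.INV)."""
    lo, hi = 0.0, df + 10.0
    while chi2_cdf(hi, df) < p:      # widen until the quantile is bracketed
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def variance_ci(s, n, alpha=0.05):
    """Two-sided CI for the variance and the SD, assuming Normal data."""
    df = n - 1
    q_lo = chi2_ppf(alpha / 2, df)        # small cutoff -> upper bound
    q_hi = chi2_ppf(1 - alpha / 2, df)    # large cutoff -> lower bound
    var_lo = df * s**2 / q_hi
    var_hi = df * s**2 / q_lo
    return (var_lo, var_hi), (math.sqrt(var_lo), math.sqrt(var_hi))

# Same inputs as the commute example in this session: s = 8.0, n = 10, 95%.
var_bounds, sd_bounds = variance_ci(8.0, 10, alpha=0.05)
print("variance CI:", var_bounds)
print("SD CI:", sd_bounds)
```

The larger cutoff goes in the denominator of the lower bound, mirroring the Excel formulas. If a statistics library is available, its chi-square quantile function can replace the two helper functions.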
What changes the width?
- Higher confidence (99% vs 95%) → wider CI.
- Larger \( n \) → narrower CI.
- More variability \( s \) → wider CI.
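A short worked equation may make these three effects concrete. Using the left-tail cutoffs \( \chi^2_{\alpha/2,\nu} \) and \( \chi^2_{1-\alpha/2,\nu} \) with \( \nu = n-1 \), the width of the variance interval and the ratio of the SD bounds are:

\[
\sigma^2_U - \sigma^2_L = (n-1)\,s^2\left(\frac{1}{\chi^2_{\alpha/2,\nu}} - \frac{1}{\chi^2_{1-\alpha/2,\nu}}\right),
\qquad
\frac{\sigma_U}{\sigma_L} = \sqrt{\frac{\chi^2_{1-\alpha/2,\nu}}{\chi^2_{\alpha/2,\nu}}}
\]

The width scales with \( s^2 \), while the ratio \( \sigma_U/\sigma_L \) does not depend on \( s \) at all. With the 95% cutoffs quoted in this session's examples, the ratio is \( \sqrt{19.02/2.700} \approx 2.65 \) for \( n=10 \) and \( \sqrt{32.852/8.907} \approx 1.92 \) for \( n=20 \), so doubling the sample size tightens the relative uncertainty about \( \sigma \) noticeably but does not remove it.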
Tips for students
- Report both: CI for \( \sigma^2 \) and for \( \sigma \) (square roots of bounds).
- Show df and the two χ² cutoffs you used.
- If Normality is doubtful at small \( n \), mention it and consider alternatives (e.g., bootstrapping \( s \)).
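The bootstrapping alternative mentioned in the last tip could look roughly like the following. This is a sketch of a simple percentile bootstrap for \( s \), one of several bootstrap variants; the function name `bootstrap_sd_ci`, the seed, the number of resamples, and the ten commute times are all made up for illustration and do not come from the session's data.

```python
import random
import statistics

def bootstrap_sd_ci(data, alpha=0.05, n_boot=5000, seed=0):
    """Percentile-bootstrap interval for the standard deviation.
    Resamples the data with replacement and takes empirical quantiles of s."""
    rng = random.Random(seed)            # fixed seed so the result is repeatable
    n = len(data)
    sds = sorted(
        statistics.stdev(rng.choices(data, k=n)) for _ in range(n_boot)
    )
    lo_idx = round((alpha / 2) * n_boot)
    hi_idx = min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)
    return sds[lo_idx], sds[hi_idx]

# Hypothetical one-way commute times in minutes (illustrative only).
commute = [24.0, 29.5, 31.0, 27.0, 45.0, 33.5, 30.0, 22.5, 38.0, 33.5]
lo, hi = bootstrap_sd_ci(commute)
print(f"s = {statistics.stdev(commute):.2f}, bootstrap 95% CI for sigma: [{lo:.2f}, {hi:.2f}]")
```

With very small samples the percentile bootstrap tends to produce intervals that are too narrow, so it is better treated as a robustness check next to the χ² interval than as a replacement for it.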
How to use t and χ² CIs in practice
Rule of thumb: use a t-interval when your question is about the mean (σ unknown). Use a χ² interval when your question is about the variability (σ or σ²).
Student Life — Commute time
Goal A (mean): “Is average one-way commute ≤ 30 minutes?” Sample: \( \bar x=31.4 \) min, \( s=8.0 \) min, \( n=10 \), 95%.
t-CI for mean (df=9):
\( \text{SE} = \frac{8}{\sqrt{10}} \approx 2.530,\quad t_{0.025,9}\approx 2.262 \)
\( \text{ME} = 2.262\times 2.530 \approx 5.72 \)
\( \mu \in [31.4-5.72,\ 31.4+5.72] = [25.68,\ 37.12] \) minutes
Decision: 30 is inside the interval → we can’t claim the mean is definitively ≤ 30. Gather more data or accept uncertainty.
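The Goal A arithmetic can be checked with a few lines of Python. This is a minimal sketch: the function name `t_interval` is an assumption for illustration, and the critical value 2.262 is taken from the text rather than computed, since the standard library has no t quantile function.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Two-sided t-interval for the mean, given a looked-up critical value."""
    se = s / math.sqrt(n)        # standard error of the mean
    me = t_crit * se             # margin of error
    return xbar - me, xbar + me

# Commute example: xbar = 31.4, s = 8.0, n = 10, t_{0.025,9} ~ 2.262
lo, hi = t_interval(31.4, 8.0, 10, t_crit=2.262)
print(f"95% CI for the mean: [{lo:.2f}, {hi:.2f}] minutes")

threshold = 30.0
if hi <= threshold:
    print("Whole interval is at or below 30: supports mean <= 30.")
elif lo > threshold:
    print("Whole interval is above 30: mean exceeds 30.")
else:
    print("30 is inside the interval: inconclusive at this sample size.")
```

Framing the decision as three cases makes it explicit that an interval straddling the threshold is a "no decision" outcome, not evidence for either side.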
Goal B (variability): “Is commute spread stable (σ ≤ 6 min)?” Same sample \( s=8.0,\ n=10 \), 95%.
χ²-CI for σ (df=9; using left-tail χ²): \( \chi^2_{0.025,9}\approx 2.700,\ \chi^2_{0.975,9}\approx 19.02 \)
Variance CI: \( \big[\,\frac{9\cdot 8^2}{19.02},\ \frac{9\cdot 8^2}{2.700}\big] \approx [30.3,\ 213.3] \)
σ-CI: \( [\sqrt{30.3},\ \sqrt{213.3}] \approx [5.5,\ 14.6] \) minutes
Decision: The σ-CI [5.5, 14.6] contains 6 but extends far above it, so we cannot claim the commute variability meets the ≤6-minute stability goal. Try larger \( n \) or interventions that reduce variance.
Engineering QC — Machined diameter
Spec: Target 42.00 mm, tolerance ±0.10 mm; desire process σ ≤ 0.10 mm. Sample: \( \bar x=41.95 \) mm, \( s=0.14 \) mm, \( n=20 \), 95%.
t-CI for mean (df=19): \( \text{SE} = \frac{0.14}{\sqrt{20}} \approx 0.0313,\ t_{0.025,19}\approx 2.093 \)
\( \text{ME} \approx 2.093 \times 0.0313 \approx 0.0655 \)
\( \mu \in [41.8845,\ 42.0155] \) mm
Interpretation: The target 42.00 lies inside the CI, so the mean is plausibly on target; the lower bound (41.88) dips just below the lower spec limit of 41.90 → watch for slight under-sizing.
χ²-CI for σ (df=19): \( \chi^2_{0.025,19}\approx 8.907,\ \chi^2_{0.975,19}\approx 32.852 \)
Var CI: \( \big[\,\frac{19\cdot 0.14^2}{32.852},\ \frac{19\cdot 0.14^2}{8.907}\big] \approx [0.0113,\ 0.0418] \)
σ-CI: \( [\sqrt{0.0113},\ \sqrt{0.0418}] \approx [0.106,\ 0.204] \) mm
Decision: Even the best-case bound (~0.106) exceeds 0.10 → process variability likely fails spec. Reduce sources of variation (tool wear, fixturing, calibration) before ramping.
Choosing the right interval
- Question about a mean? Use t-interval (σ unknown). Show \( \bar x, s, n, \text{df}, t \), CI, and your decision in words tied to the practical threshold.
- Question about variability (risk/spread)? Use χ²-interval for \( \sigma^2 \) or \( \sigma \). Show df and the two χ² cutoffs you used.
- Both matter (common in QC & SLAs): compute both CIs and make a joint decision (center and spread).
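A joint center-and-spread check for the machined-diameter example might be sketched as follows. The function names `mean_ci` and `sd_ci` and the three-way wording of the verdict are assumptions made for illustration; the critical values (2.093, 8.907, 32.852) are the ones quoted in the example, passed in rather than computed.

```python
import math

def mean_ci(xbar, s, n, t_crit):
    """t-interval for the mean with a looked-up critical value."""
    me = t_crit * s / math.sqrt(n)
    return xbar - me, xbar + me

def sd_ci(s, n, chi2_lower_q, chi2_upper_q):
    """Chi-square interval for sigma; cutoffs are left-tail quantiles."""
    df = n - 1
    return (math.sqrt(df * s**2 / chi2_upper_q),
            math.sqrt(df * s**2 / chi2_lower_q))

target, sigma_max = 42.00, 0.10      # spec from the QC example
mu_lo, mu_hi = mean_ci(41.95, 0.14, 20, t_crit=2.093)
sd_lo, sd_hi = sd_ci(0.14, 20, chi2_lower_q=8.907, chi2_upper_q=32.852)

centered = mu_lo <= target <= mu_hi  # target is a plausible mean
if sd_hi <= sigma_max:
    spread = "meets the sigma limit"
elif sd_lo > sigma_max:
    spread = "exceeds the sigma limit"
else:
    spread = "is inconclusive"

print(f"mean CI  [{mu_lo:.4f}, {mu_hi:.4f}] mm, target inside: {centered}")
print(f"sigma CI [{sd_lo:.3f}, {sd_hi:.3f}] mm, spread {spread}")
```

Keeping the two checks separate shows why a process can look acceptable on centering and still fail on spread, which is the situation in this example.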
Biases & pitfalls (and how to handle them)
- Sampling bias: Only morning commutes? Only one operator/shift? Fix: randomize times, use stratified sampling (rush vs off-peak; operator A/B/C), ensure coverage.
- Non-independence: Back-to-back parts or same route/day cause autocorrelation. Fix: sample across days/batches; space out runs; mix machines/fixtures.
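- Non-Normal data: The χ² CI for σ assumes Normal data, and its coverage stays sensitive to skew and heavy tails even when \( n \) is large. Heavy skew/outliers (e.g., accidents) inflate \( s \). Fix: consider transforming (log-minutes) or bootstrapping σ as a robustness check, and state this in your write-up; a larger \( n \) narrows the interval but does not by itself repair the Normality assumption.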
- Measurement error: Phone GPS lag; calipers out of calibration. Fix: calibrate tools; repeat-measure a standard; average repeated reads.
- Underpowered studies: Wide CIs lead to “no decision.” Fix: increase \( n \) using your margin-planning formulas (z first, then refine with t).
- Cherry-picking: Don’t drop “bad days” or “bad parts.” Fix: pre-register your inclusion rules; report sensitivity with/without outliers.
- Shifting process: Tool wear or seasonal traffic changes mean data aren’t i.i.d. Fix: segment by time; monitor drift; re-estimate regularly.
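For the underpowered-study item above, the first (z-based) planning step for a mean can be sketched as below. This assumes the usual formula \( n \ge (z\,\sigma/E)^2 \) with a pilot estimate of \( \sigma \); the function name `n_for_margin` and the 3-minute target margin are hypothetical choices for illustration, and the follow-up refinement with t is not implemented here.

```python
import math
from statistics import NormalDist

def n_for_margin(sigma_guess, margin, confidence=0.95):
    """Smallest n so that z * sigma / sqrt(n) <= margin (z-based first pass)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma_guess / margin) ** 2)

# Pilot SD from the commute example (s = 8.0); target margin is hypothetical.
print(n_for_margin(8.0, 3.0))   # first-pass n for a +/- 3 minute margin
```

Because the t critical value is larger than z at small df, the refined sample size will be at least this large, so the z-based figure is best read as a lower starting point.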
Career connections
- Operations / SLAs: Use t-CIs to show customers “95% CI for average response < target,” and χ²-CIs to back up claims about the stability of response-time variance.
- Manufacturing QA: Pair t-CI (centering) with χ²-CI (spread) before releasing lots; tie σ-CI to capability (Cp/Cpk) narratives.
- Healthcare / Service: Wait-time improvement pilots: t-CI for mean wait; χ²-CI to check whether the spread of waits (and with it the risk of extreme delays) has actually narrowed.
- Finance / Risk: t-CI for average P&L uplift; χ²-CI for volatility bands when presenting strategies to PMs/committees.
Student reporting checklist
- Question & threshold: State the business/engineering question in words.
- Inputs: \( \bar x, s, n \) (or raw data path), chosen confidence, assumptions.
- Show work: df, critical values (t or χ²), SE/ME, and the CI with units.
- Decision in words: Tie CI to the threshold (e.g., “cannot claim ≤ 30 min”).
- Bias check: Note sampling and Normality issues; propose one improvement.