🔍 Session 2.8 - Bayes’ Theorem

Understanding Bayes' Theorem

Bayes’ Theorem lets us reverse conditional probabilities. When we know P(B | A) but actually want P(A | B), the theorem computes the latter from the former (together with P(A) and P(B)).

P(A | B) = [P(B | A) × P(A)] / P(B)

We often use this when we know how often a result happens given a condition (like test results), and we want to know how likely the condition is after seeing the result.
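As a minimal sketch (plain Python, nothing assumed beyond the formula above), the inversion is a one-line function:

```python
def bayes(p_b_given_a, p_a, p_b):
    # P(A|B) = P(B|A) * P(A) / P(B)
    return p_b_given_a * p_a / p_b

# Illustrative numbers: P(B|A) = 0.80, P(A) = 0.20, P(B) = 0.40
print(bayes(0.80, 0.20, 0.40))  # ≈ 0.40
```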

Bayes’ Theorem — Derivation and Tree (Pepper Example)

P(A∩B) = P(A)P(B|A) = P(B)P(A|B)
⇒ P(A|B) = [P(B|A)·P(A)] / P(B)
P(B) = P(B|A)P(A) + P(B|¬A)P(¬A)

Let A = Foreign, B = Likes Pepper. Use: P(A)=0.20, P(B|A)=0.80, P(B|¬A)=0.30.

A = Foreign: P(A) = 0.20
  • Likes Pepper (B): P(B|A) = 0.80 → P(A∩B) = 0.16
  • Not (¬B): 1 − P(B|A) = 0.20 → P(A∩¬B) = 0.04
¬A = USA: P(¬A) = 0.80
  • Likes Pepper (B): P(B|¬A) = 0.30 → P(¬A∩B) = 0.24
  • Not (¬B): 1 − P(B|¬A) = 0.70 → P(¬A∩¬B) = 0.56

P(B) = 0.16 + 0.24 = 0.40
P(A|B) = 0.16 / 0.40 = 0.40
              Likes Pepper (B)   Not Like Pepper (¬B)   Total
Foreign (A)   0.16               0.04                   0.20
USA (¬A)      0.24               0.56                   0.80
Total         0.40               0.60                   1.00
Results among pepper-lovers (B):
P(Foreign | B) = 0.16 / 0.40 = 0.40 = 40%
P(USA | B) = 0.24 / 0.40 = 0.60 = 60%
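The whole tree can be reproduced in a few lines of Python (using the numbers above):

```python
p_a = 0.20           # P(Foreign)
p_b_given_a = 0.80   # P(Likes Pepper | Foreign)
p_b_given_na = 0.30  # P(Likes Pepper | USA)

# Joint probabilities (the tree's branch products)
p_ab = p_b_given_a * p_a          # P(A ∩ B) = 0.16
p_nab = p_b_given_na * (1 - p_a)  # P(¬A ∩ B) = 0.24

p_b = p_ab + p_nab                # total probability: 0.40
p_a_given_b = p_ab / p_b          # Bayes: 0.40

print(round(p_b, 2), round(p_a_given_b, 2))  # 0.4 0.4
```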

General Form of Bayes' Theorem (Multiple Events)

If events E₁, E₂, ..., Eₖ are mutually exclusive and exhaustive, and B is any event, then:

P(E₁ | B) = [P(B | E₁) × P(E₁)] / [P(B | E₁)P(E₁) + P(B | E₂)P(E₂) + ... + P(B | Eₖ)P(Eₖ)]

This form is used when there are several possible causes for one outcome.
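The multi-cause form translates directly into a small helper (a sketch; the function name `posterior` is my own, not standard):

```python
def posterior(likelihoods, priors):
    """Return [P(E_i | B)] given likelihoods[i] = P(B | E_i) and priors[i] = P(E_i)."""
    evidence = sum(l * p for l, p in zip(likelihoods, priors))  # denominator = P(B)
    return [l * p / evidence for l, p in zip(likelihoods, priors)]

# Two causes with illustrative numbers: P(B|E1) = 0.10, P(B|E2) = 0.005, priors 0.20 / 0.80
print(posterior([0.10, 0.005], [0.20, 0.80]))
```

Because each term is divided by the same total, the returned posteriors always sum to 1.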

🧪 Example 2.26 – Contamination Problem

A semiconductor chip fails. Given: P(H) = 0.20 (probability of high contamination during manufacturing), P(F|H) = 0.10 (failure rate under high contamination), and P(F|H′) = 0.005 (failure rate otherwise). What is the probability that high contamination was present, P(H|F)?

Step 1 – Overall failure rate
P(F) = P(F|H)P(H) + P(F|H′)P(H′) = (0.10)(0.20) + (0.005)(0.80) = 0.020 + 0.004 = 0.024.
Common mistake: P(F) ≠ 0.10 + 0.005.
Those are conditional rates; weight by prevalence: P(F) = 0.10×0.20 + 0.005×0.80 = 0.024. Sanity check: 0.024 lies between 0.005 and 0.10.
Step 2 – Bayes’ Theorem (invert to get P(H|F))
P(H | F) = [P(F | H) · P(H)] / P(F) = (0.10 × 0.20) / 0.024 = 0.020 / 0.024 = 20/24 = 5/6 ≈ 0.8333

Interpretation: Among failed chips, about 83% were produced under high contamination. It’s not 100% because some failures also occur when contamination isn’t high.

Frequency picture (per 1000 chips)
Condition                Chips   Fail   Pass
High contamination (H)   200     20     180
Not high (H′)            800     4      796
Totals                   1000    24     976

Given a failure, P(H|F) = 20/24 = 5/6.
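Exact arithmetic with Python's `fractions` module reproduces P(H|F) = 5/6 exactly, with no rounding at any step:

```python
from fractions import Fraction

p_h = Fraction(20, 100)     # P(H) = 0.20
p_f_h = Fraction(10, 100)   # P(F|H) = 0.10
p_f_nh = Fraction(5, 1000)  # P(F|H') = 0.005

p_f = p_f_h * p_h + p_f_nh * (1 - p_h)  # total probability of failure
p_h_f = p_f_h * p_h / p_f               # posterior

print(p_f, p_h_f)  # 3/125 5/6
```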

Likelihood-ratio / odds view (why 83% is so high)
Prior odds for H:   P(H)/P(H′) = 0.20/0.80 = 1/4.
Likelihood ratio:   LR = P(F|H) / P(F|H′) = 0.10 / 0.005 = 20.
Posterior odds:   prior × LR = (1/4) × 20 = 5.
Convert to probability: 5 / (1 + 5) = 5/6 = 0.8333….

Intuition: a failure is 20× more likely under high contamination. That strong evidence flips prior odds of 1:4 into posterior odds of 5:1.

Avoid premature rounding—carry at least 3–4 decimals until the end. Here, 0.020/0.024 = 5/6 exactly.
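The odds-form calculation above, written out in Python with the same numbers:

```python
prior_odds = 0.20 / 0.80   # P(H) / P(H') = 1/4
lr = 0.10 / 0.005          # likelihood ratio = 20
posterior_odds = prior_odds * lr          # 1/4 × 20 = 5
p_h_f = posterior_odds / (1 + posterior_odds)  # odds → probability

print(round(p_h_f, 4))  # 0.8333
```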

🏥 Example 2.27 – Medical Diagnostic

A new test has:

  • Sensitivity: P(Positive | Sick) = 0.99 (the test detects 99% of sick people)
  • False-positive rate: P(Positive | Healthy) = 0.05 (specificity 95%)
  • Disease prevalence: P(Sick) = 0.0001 (0.01% of the population)

You test positive. What is P(Sick | Positive)?

Step 1 – Write Bayes’ Theorem
P(Sick | Positive) = [P(Positive | Sick) × P(Sick)] / P(Positive)
Step 2 – Expand denominator with law of total probability
P(Positive) = P(Positive | Sick)P(Sick) + P(Positive | Healthy)P(Healthy).
Step 3 – Plug in numbers
Numerator = (0.99)(0.0001) = 0.000099
Denominator = (0.99)(0.0001) + (0.05)(0.9999)
= 0.000099 + 0.049995 ≈ 0.050094

P(Sick | Positive) = 0.000099 / 0.050094 ≈ 0.00198 ≈ 0.2%.
Step 4 – Interpret in plain words
Although the test is very sensitive (99%) and fairly specific (95%), the disease is extremely rare (0.01%).
Out of all positives, most will actually be false positives. So if you test positive, the chance you really have the disease is only about 1 in 506.
Frequency picture (per 1,000,000 people)
Condition   Population   Positive   Negative
Sick        100          99         1
Healthy     999,900      49,995     949,905
Totals      1,000,000    50,094     949,906

Among 50,094 positives, only 99 are real cases → 99 / 50,094 ≈ 0.2%.
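The same computation in Python (prevalence 0.0001, sensitivity 0.99, false-positive rate 0.05, as in the example):

```python
prev = 0.0001  # P(Sick)
sens = 0.99    # P(Positive | Sick)
fpr = 0.05     # P(Positive | Healthy) = 1 - specificity

p_pos = sens * prev + fpr * (1 - prev)  # total probability of a positive
ppv = sens * prev / p_pos               # P(Sick | Positive)

print(round(p_pos, 6), round(ppv, 5))  # 0.050094 0.00198
```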

Takeaway

A highly accurate test does not guarantee a high post-test probability. When the condition is rare, the base rate dominates: most positives are false positives, so always combine the test's accuracy with the prevalence before interpreting a result.

Bayes’ Theorem — Dorms & Gender (ICE 9/8/2025)

Question. Students live in Dorm A (30%), Dorm B (50%), or Dorm C (20%). Female proportions by dorm: A = 90% female, B = 30% female, C = 50% female. If a randomly selected student is known to be female, find:

  1. P(Dorm A | Female)
  2. P(Dorm B | Female)
  3. P(Dorm C | Female)
Solution
Formula to use (Bayes + Total Probability)
Posterior for a dorm D:  P(D | F) = [ P(F | D) × P(D) ] / [ P(F | A)×P(A) + P(F | B)×P(B) + P(F | C)×P(C) ]
Total female probability:  P(F) = P(F | A)×P(A) + P(F | B)×P(B) + P(F | C)×P(C)
Use decimals: 30% → 0.30, 90% → 0.90, etc.

Step 1 — Given

  • P(A) = 0.30, P(B) = 0.50, P(C) = 0.20
  • P(F | A) = 0.90, P(F | B) = 0.30, P(F | C) = 0.50

Step 2 — Compute the denominator P(F) (Total Probability)

P(F) = (0.90×0.30) + (0.30×0.50) + (0.50×0.20) = 0.27 + 0.15 + 0.10 = 0.52
Dorm         P(Dorm)   P(Female | Dorm)   P(Female ∧ Dorm)
A            0.30      0.90               0.90 × 0.30 = 0.27
B            0.50      0.30               0.30 × 0.50 = 0.15
C            0.20      0.50               0.50 × 0.20 = 0.10
Sum = P(F)                                0.27 + 0.15 + 0.10 = 0.52

Step 3 — Plug into Bayes’ Theorem (one line each)

P(A | F) = (0.90×0.30) / 0.52 = 0.27 / 0.52 = 27/52 ≈ 0.5192 (51.92%)
P(B | F) = (0.30×0.50) / 0.52 = 0.15 / 0.52 = 15/52 ≈ 0.2885 (28.85%)
P(C | F) = (0.50×0.20) / 0.52 = 0.10 / 0.52 = 10/52 = 5/26 ≈ 0.1923 (19.23%)

Check: 0.5192 + 0.2885 + 0.1923 = 1.0000

Why the Denominator Works
The denominator is the probability of being female overall, across all dorms. Each term is the chance of being in a dorm and being female from that dorm: P(F ∧ A) + P(F ∧ B) + P(F ∧ C). This is the Law of Total Probability.

Step 4 — Probability Tree (visual check)

Dorm A (0.30):
  • Female | A = 0.90 → 0.27
  • Male   | A = 0.10 → 0.03
Dorm B (0.50):
  • Female | B = 0.30 → 0.15
  • Male   | B = 0.70 → 0.35
Dorm C (0.20):
  • Female | C = 0.50 → 0.10
  • Male   | C = 0.50 → 0.10

Total Female = 0.27 + 0.15 + 0.10 = 0.52
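All three posteriors can be checked with a short loop (a sketch using the given numbers):

```python
priors = {"A": 0.30, "B": 0.50, "C": 0.20}     # P(Dorm)
p_f_given = {"A": 0.90, "B": 0.30, "C": 0.50}  # P(Female | Dorm)

p_f = sum(p_f_given[d] * priors[d] for d in priors)      # total: 0.52
post = {d: p_f_given[d] * priors[d] / p_f for d in priors}

for d, p in post.items():
    print(d, round(p, 4))  # A 0.5192, B 0.2885, C 0.1923
```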

🧠 Try This Practice Problem

A factory uses two machines. Machine A produces 40% of items with a 2% defect rate. Machine B produces 60% of items with a 5% defect rate. An item is found to be defective. What’s the probability it came from Machine B?

Let D = defective, A = machine A, B = machine B.
P(D | B) = 0.05, P(B) = 0.6
P(D | A) = 0.02, P(A) = 0.4
P(D) = 0.05 × 0.6 + 0.02 × 0.4 = 0.03 + 0.008 = 0.038
P(B | D) = [P(D | B) × P(B)] / P(D) = 0.03 / 0.038 ≈ 0.789
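A quick Python check of the practice problem:

```python
p_a, p_b = 0.4, 0.6        # machine shares
p_d_a, p_d_b = 0.02, 0.05  # defect rates per machine

p_d = p_d_a * p_a + p_d_b * p_b  # total defect probability: 0.038
p_b_given_d = p_d_b * p_b / p_d  # posterior for Machine B

print(round(p_d, 3), round(p_b_given_d, 3))  # 0.038 0.789
```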

📢 Student Q&A (Bayes’ Theorem)

Q1: Why isn’t P(H | F) the same as P(F | H)?

A1: They answer different questions. P(F|H) is “fail rate under high contamination.” P(H|F) is “chance contamination was high given a failure.” Bayes connects them: P(H|F) = [P(F|H)·P(H)] / P(F). You must also know the prior P(H) and the overall rate P(F).

Q2: Where does P(F) in the denominator come from?

A2: From the law of total probability: P(F) = P(F|H)P(H) + P(F|H′)P(H′) (for two causes). It’s the weighted average of failure rates across all possible conditions, not “0.10 + 0.005”.

Q3: What’s the “base-rate fallacy” in Bayes problems?

A3: Ignoring the prior P(H). Even a strong indicator (large P(F|H)) can yield a modest P(H|F) if P(H) is small. Always combine the likelihood (P(F|H)) with the base rate P(H).

Q4: How do I handle multiple possible causes (H₁, H₂, …, Hₖ)?

A4: Use the multi-cause form: P(Hᵢ | F) = [P(F|Hᵢ)·P(Hᵢ)] / Σⱼ P(F|Hⱼ)·P(Hⱼ). It’s the same idea—posterior is proportional to prior × likelihood, normalized by the total evidence.

Q5: Any quick intuition for P(H | F) = 5/6 in the contamination example?

A5: Think in counts. Per 1000 chips: 200 are high-contam → 10% fail = 20 fails; 800 are not-high → 0.5% fail = 4 fails. Among all 24 failures, 20 came from high contamination → 20/24 = 5/6 ≈ 0.8333.