Session 5.2 — Conditional Probability from a Joint Distribution
1) Joint PMF fXY(x,y) — Discrete Example
| y \ x | 1 bar | 2 bars | 3 bars | Row sum fY(y) | 
|---|---|---|---|---|
| 4 | 0.15 | 0.10 | 0.05 | 0.30 | 
| 3 | 0.02 | 0.10 | 0.05 | 0.17 | 
| 2 | 0.02 | 0.03 | 0.20 | 0.25 | 
| 1 | 0.01 | 0.02 | 0.25 | 0.28 | 
| Col sum fX(x) | 0.20 | 0.25 | 0.55 | 1.00 | 
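The row and column sums in the table can be checked programmatically; a minimal sketch (the dictionary below simply transcribes the table above, with x = number of bars and y = rating):

```python
# Joint PMF f_XY(x, y) transcribed from the table above:
# x = number of bars (1..3), y = rating (1..4).
joint = {
    (1, 4): 0.15, (2, 4): 0.10, (3, 4): 0.05,
    (1, 3): 0.02, (2, 3): 0.10, (3, 3): 0.05,
    (1, 2): 0.02, (2, 2): 0.03, (3, 2): 0.20,
    (1, 1): 0.01, (2, 1): 0.02, (3, 1): 0.25,
}

xs, ys = (1, 2, 3), (1, 2, 3, 4)

# Marginals: f_X(x) sums over y (column sums); f_Y(y) sums over x (row sums).
f_X = {x: sum(joint[(x, y)] for y in ys) for x in xs}
f_Y = {y: sum(joint[(x, y)] for x in xs) for y in ys}

print({x: round(p, 2) for x, p in f_X.items()})  # {1: 0.2, 2: 0.25, 3: 0.55}
print(round(sum(joint.values()), 2))             # 1.0
```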
2) Conditional PMFs (numerical examples)
P(Y | X = 3), using fX(3) = 0.55:
| y | fXY(3,y) | P(Y|X=3) | 
|---|---|---|
| 1 | 0.25 | 0.4545 | 
| 2 | 0.20 | 0.3636 | 
| 3 | 0.05 | 0.0909 | 
| 4 | 0.05 | 0.0909 | 
| Sum | 0.55 | 1.0000 ✔ | 
P(X | Y = 1), using fY(1) = 0.28:
| x | fXY(x,1) | P(X|Y=1) | 
|---|---|---|
| 1 | 0.01 | 0.0357 | 
| 2 | 0.02 | 0.0714 | 
| 3 | 0.25 | 0.8929 | 
| Sum | 0.28 | 1.0000 ✔ | 
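Both conditional tables follow directly from the definition \(p(y \mid x) = f_{XY}/f_X\): divide a column (or row) of the joint table by its marginal. A sketch, with the joint dictionary transcribing Section 1's table:

```python
joint = {
    (1, 4): 0.15, (2, 4): 0.10, (3, 4): 0.05,
    (1, 3): 0.02, (2, 3): 0.10, (3, 3): 0.05,
    (1, 2): 0.02, (2, 2): 0.03, (3, 2): 0.20,
    (1, 1): 0.01, (2, 1): 0.02, (3, 1): 0.25,
}

# P(Y | X = 3): divide the x = 3 column by f_X(3).
fX3 = sum(joint[(3, y)] for y in (1, 2, 3, 4))             # 0.55
p_Y_given_X3 = {y: joint[(3, y)] / fX3 for y in (1, 2, 3, 4)}

# P(X | Y = 1): divide the y = 1 row by f_Y(1).
fY1 = sum(joint[(x, 1)] for x in (1, 2, 3))                # 0.28
p_X_given_Y1 = {x: joint[(x, 1)] / fY1 for x in (1, 2, 3)}

print({y: round(p, 4) for y, p in p_Y_given_X3.items()})
# {1: 0.4545, 2: 0.3636, 3: 0.0909, 4: 0.0909}
print(round(sum(p_X_given_Y1.values()), 4))  # 1.0
```

Each conditional PMF sums to 1 by construction, which is the ✔ check in the tables.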
3) Textbook Rules — Discrete Case (PMF & “CMF”/CDF)
Conditional Probability Mass Function
        \( \textbf{Conditional PMF:}\quad
           p_{Y|X}(y \mid x) \;=\; \dfrac{f_{XY}(x,y)}{f_X(x)},\quad \text{for } f_X(x) > 0 \)
      
      - Nonnegativity: \( \; p_{Y|X}(y \mid x) \ge 0 \)
- Normalization: \( \; \sum_{y} p_{Y|X}(y \mid x) = 1 \)
- Set probabilities: \( \; P(Y \in B \mid X=x) = \sum_{y \in B} p_{Y|X}(y \mid x) \)
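The set-probability rule in action: for example, \(P(Y \ge 3 \mid X = 3)\) from Section 2's numbers (a small sketch):

```python
# p_{Y|X=3} from Section 2; sum the conditional PMF over B = {3, 4}.
p_Y_given_X3 = {1: 0.25 / 0.55, 2: 0.20 / 0.55, 3: 0.05 / 0.55, 4: 0.05 / 0.55}

B = {3, 4}
prob = sum(p for y, p in p_Y_given_X3.items() if y in B)
print(round(prob, 4))  # 0.1818  (= 0.10 / 0.55)
```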
Joint & Marginal PMFs
        \( f_X(x) = \sum_{y} f_{XY}(x,y), \qquad
           f_Y(y) = \sum_{x} f_{XY}(x,y) \)
\( \text{If independent: } f_{XY}(x,y) = f_X(x)\,f_Y(y) \)
Joint CDF (sometimes called the CMF in the discrete case)
        \( F_{XY}(x,y) = P(X \le x,\; Y \le y)
           = \sum_{u \le x}\sum_{v \le y} f_{XY}(u,v) \)
\( F_X(x) = \sum_{u \le x} f_X(u), \qquad F_Y(y) = \sum_{v \le y} f_Y(v) \)
\( F_{Y|X}(y \mid x) = P(Y \le y \mid X=x) = \sum_{v \le y} p_{Y|X}(v \mid x) \)
\( \text{If independent: } F_{XY}(x,y) = F_X(x)\,F_Y(y) \)
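The double-sum definition of \(F_{XY}\) can be evaluated directly on Section 1's joint table; a sketch computing, e.g., \(F_{XY}(2, 2)\):

```python
joint = {
    (1, 4): 0.15, (2, 4): 0.10, (3, 4): 0.05,
    (1, 3): 0.02, (2, 3): 0.10, (3, 3): 0.05,
    (1, 2): 0.02, (2, 2): 0.03, (3, 2): 0.20,
    (1, 1): 0.01, (2, 1): 0.02, (3, 1): 0.25,
}

def F_XY(x, y):
    """Joint CDF: sum f_XY(u, v) over all u <= x and v <= y."""
    return sum(p for (u, v), p in joint.items() if u <= x and v <= y)

print(round(F_XY(2, 2), 2))  # 0.08 = 0.01 + 0.02 + 0.02 + 0.03
print(round(F_XY(3, 4), 2))  # 1.0  (the whole table)
```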
4) Conditional Probability Density Function — Continuous Case
        \( \textbf{Conditional pdf:}\quad
           f_{Y|X}(y \mid x) \;=\; \dfrac{f_{XY}(x,y)}{f_X(x)},\quad \text{for } f_X(x) > 0 \)
      
1) \( f_{Y|X}(y \mid x) \ge 0 \)
2) \( \displaystyle \int f_{Y|X}(y \mid x)\,dy = 1 \)
3) \( P(Y \in B \mid X=x) = \displaystyle \int_{B} f_{Y|X}(y \mid x)\,dy \)
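A numeric sanity check of property 2, assuming the hypothetical joint density \(f_{XY}(x,y) = x + y\) on the unit square (an illustration, not from the text):

```python
# Hypothetical joint density on [0, 1] x [0, 1] (assumption for illustration).
def f_XY(x, y):
    return x + y

def f_X(x, n=10000):
    # Marginal f_X(x) = integral of f_XY over y, via a midpoint Riemann sum.
    dy = 1.0 / n
    return sum(f_XY(x, (i + 0.5) * dy) for i in range(n)) * dy

x0, n = 0.5, 10000
dy = 1.0 / n
fx = f_X(x0)  # exact value is x0 + 1/2 = 1.0

# The conditional density f_{Y|X}(y | x0) = f_XY(x0, y) / f_X(x0)
# integrates to 1 over y, as property 2 requires.
total = sum(f_XY(x0, (i + 0.5) * dy) / fx for i in range(n)) * dy
print(round(total, 6))  # 1.0
```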
      5) Conditional Mean and Variance
          \( \textbf{Conditional mean of } Y \text{ given } X=x:\quad
             \mathbb{E}(Y \mid X=x) \;=\; \int y\, f_{Y|X}(y \mid x)\,dy \)
          
\( \textbf{Conditional variance of } Y \text{ given } X=x:\quad V(Y \mid X=x) \;=\; \int (y - \mu_{Y|X})^{2}\, f_{Y|X}(y \mid x)\,dy \;=\; \int y^{2} f_{Y|X}(y \mid x)\,dy \;-\; \mu_{Y|X}^{2} \)
where \( \mu_{Y|X} = \mathbb{E}(Y \mid X=x) \).
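In the discrete case the integrals become sums; applied to \(P(Y \mid X = 3)\) from Section 2, a sketch:

```python
# Discrete analogue of the formulas above (sums replace integrals),
# applied to P(Y | X = 3) from Section 2.
p = {1: 0.25 / 0.55, 2: 0.20 / 0.55, 3: 0.05 / 0.55, 4: 0.05 / 0.55}

mean = sum(y * q for y, q in p.items())                    # E(Y | X=3) = 20/11
var = sum(y ** 2 * q for y, q in p.items()) - mean ** 2    # V(Y | X=3) = 106/121

print(round(mean, 4), round(var, 4))  # 1.8182 0.876
```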
6) Independence — Textbook YES Example & Checker
Definition & Equivalent Conditions
      \(X\) and \(Y\) are independent iff any (hence all) hold:
      
- \(f_{XY}(x,y)=f_X(x)\,f_Y(y)\) for all \(x,y\).
- \(p_{Y|X}(y\mid x)=p_Y(y)\) for all \(x,y\) with \(f_X(x)>0\) (discrete).
- \(p_{X|Y}(x\mid y)=p_X(x)\) for all \(x,y\) with \(f_Y(y)>0\) (discrete).
- \(P(X\in A,\;Y\in B)=P(X\in A)\,P(Y\in B)\) for all sets \(A,B\).
Why the independence check matters
- Simplifies probabilities: \(P(X\in A,\;Y\in B)=P(X\in A)\,P(Y\in B)\). One 2-D query becomes two 1-D queries.
- Factorizes the joint: \(f_{XY}(x,y)=f_X(x)f_Y(y)\), so you can work with the two marginals instead of a full table/surface.
- Conditionals collapse: \(p_{Y|X}(y\mid x)=p_Y(y)\) and \(p_{X|Y}(x\mid y)=p_X(x)\) — knowing one variable doesn’t change the other’s distribution.
- Expectation rules: for suitable \(g,h\), \( \mathbb{E}[g(X)h(Y)]=\mathbb{E}[g(X)]\,\mathbb{E}[h(Y)] \); in particular \( \mathbb{E}[XY]=\mathbb{E}[X]\mathbb{E}[Y] \).
- Variance additivity: \( \mathrm{Var}(X+Y)=\mathrm{Var}(X)+\mathrm{Var}(Y) \) (when independent), which is handy for error budgets.
- Caution: Independence is a strong assumption — one counterexample (any cell where \(f_{XY}\ne f_Xf_Y\)) is enough to reject it; a rectangular support is necessary but not sufficient.
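A checker in the spirit of the caution above: scan Section 1's signal-bars table for any cell where \(f_{XY}(x,y) \ne f_X(x)f_Y(y)\); one counterexample suffices to reject independence.

```python
joint = {
    (1, 4): 0.15, (2, 4): 0.10, (3, 4): 0.05,
    (1, 3): 0.02, (2, 3): 0.10, (3, 3): 0.05,
    (1, 2): 0.02, (2, 2): 0.03, (3, 2): 0.20,
    (1, 1): 0.01, (2, 1): 0.02, (3, 1): 0.25,
}
xs, ys = (1, 2, 3), (1, 2, 3, 4)
f_X = {x: sum(joint[(x, y)] for y in ys) for x in xs}
f_Y = {y: sum(joint[(x, y)] for x in xs) for y in ys}

# Cells violating the product form (tolerance guards floating point).
bad = [(x, y) for x in xs for y in ys
       if abs(joint[(x, y)] - f_X[x] * f_Y[y]) > 1e-9]

print(len(bad) == 0)  # False: the signal-bars table is NOT independent
print((3, 1) in bad)  # True: f_XY(3,1) = 0.25 vs f_X(3)*f_Y(1) = 0.154
```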
Example 5.8 (Discrete, textbook) — Product form & identical conditionals
Marginals: \(f_X(0,1,2)=(0.75,0.20,0.05)\); \(f_Y(0,1,2,3)=(0.30,0.28,0.25,0.17)\). Joint = outer product \(f_Y\otimes f_X\).
Joint \(f_{XY}(x,y)\) with marginals
| y \ x | 0 | 1 | 2 | Row \(f_Y(y)\) | 
|---|---|---|---|---|
| 0 | 0.2250 | 0.0600 | 0.0150 | 0.30 | 
| 1 | 0.2100 | 0.0560 | 0.0140 | 0.28 | 
| 2 | 0.1875 | 0.0500 | 0.0125 | 0.25 | 
| 3 | 0.1275 | 0.0340 | 0.0085 | 0.17 | 
| Col \(f_X(x)\) | 0.75 | 0.20 | 0.05 | 1.00 | 
Comparing any joint cell \(f_{XY}(x,y)\) with the product \(f_X(x)f_Y(y)\), and any conditional \(p_{Y|X}(y\mid x)\) with \(f_Y(y)\), shows they agree cell by cell.
      Conditionals \(p_{Y|X=x}(y)\) (same for each \(x\))
| y | \(p_{Y|X=0}(y)\) | \(p_{Y|X=1}(y)\) | \(p_{Y|X=2}(y)\) | = \(f_Y(y)\) | 
|---|---|---|---|---|
| 0 | 0.30 | 0.30 | 0.30 | 0.30 | 
| 1 | 0.28 | 0.28 | 0.28 | 0.28 | 
| 2 | 0.25 | 0.25 | 0.25 | 0.25 | 
| 3 | 0.17 | 0.17 | 0.17 | 0.17 | 
These equalities show \(p_{Y|X}(y\mid x)=f_Y(y)\) for all \(x\) ⇒ independence.
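Example 5.8's construction can be reproduced directly: build the joint as the outer product of the two marginals and confirm that every conditional collapses to \(f_Y\). A sketch:

```python
f_X = {0: 0.75, 1: 0.20, 2: 0.05}
f_Y = {0: 0.30, 1: 0.28, 2: 0.25, 3: 0.17}

# Joint as the outer product f_X(x) * f_Y(y) (the product form of independence).
joint = {(x, y): f_X[x] * f_Y[y] for x in f_X for y in f_Y}

# Conditionals p_{Y|X}(y | x) = joint / f_X(x) equal f_Y(y) for every x.
same = all(abs(joint[(x, y)] / f_X[x] - f_Y[y]) < 1e-12
           for x in f_X for y in f_Y)

print(same)                     # True
print(round(joint[(0, 2)], 4))  # 0.1875, matching the table above
```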