Section 5.1 — Joint & Marginal Distributions
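1) Joint Distributions for Two Random Variables
Joint Probability Mass Function (Discrete)
(1) Nonnegativity: $f_{XY}(x, y) \ge 0$
(2) Normalization: $\sum_x \sum_y f_{XY}(x, y) = 1$
(3) Event probability: $f_{XY}(x, y) = P(X = x, Y = y)$
Joint Probability Density Function (Continuous)
$f_{XY}(x, y) \ge 0$, $\iint_{\mathbb{R}^2} f_{XY}(x, y)\, dx\, dy = 1$.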
For a region R ⊆ ℝ²,
$P\{(X, Y) \in R\} = \iint_R f_{XY}(x, y)\, dx\, dy$.
Joint CDF:
$F_{XY}(x, y) = P\{X \le x,\ Y \le y\} = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{XY}(u, v)\, dv\, du$.
Interpretation: the joint PMF assigns probabilities to grid points (x, y); the joint PDF is a surface whose volume over a region equals the probability that (X, Y) falls in that region.
1.1) Class Survey Example — Lotte Market Visit (23 Students)
A simple real dataset: 23 JU students visited Lotte Market (Jacksonville). Each student recorded (1) minutes spent and (2) money spent ($). We’ll treat these as joint random variables:
- X = visit duration (min): {5–10, 10–30, 30–60}
- Y = money spent ($): {under 10, 10–30, 30–80}
Observed counts (out of 23 total):
| Y \ X | 5–10 min | 10–30 min | 30–60 min | Row total (÷ 23 gives fY(y)) |
|---|---|---|---|---|
| Under $10 | 4 | 2 | 1 | 7 |
| $10–30 | 2 | 5 | 3 | 10 |
| $30–80 | 0 | 2 | 4 | 6 |
| Column total (÷ 23 gives fX(x)) | 6 | 9 | 8 | 23 |
Convert to probabilities by dividing each count by 23.
Marginal fX(10–30) = (2+5+2)/23 = 9/23 ≈ 0.391.
Marginal fY($10–30) = (2+5+3)/23 = 10/23 ≈ 0.435.
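The conversion from counts to a joint PMF and its marginals can be sketched in a few lines. This is a minimal illustration, not part of the original survey write-up; the variable names (`counts`, `joint`, `f_x`, `f_y`) are chosen here for convenience, and exact fractions are used so the results match the hand calculations.

```python
from fractions import Fraction

# Observed counts from the table above: rows = Y (money), columns = X (minutes).
counts = [
    [4, 2, 1],  # under $10
    [2, 5, 3],  # $10-30
    [0, 2, 4],  # $30-80
]
n = sum(sum(row) for row in counts)  # total number of students

# Joint PMF: each count divided by the total.
joint = [[Fraction(c, n) for c in row] for row in counts]

# Marginal of Y: sum across each row. Marginal of X: sum down each column.
f_y = [sum(row) for row in joint]
f_x = [sum(col) for col in zip(*joint)]

print("f_X:", [str(p) for p in f_x])
print("f_Y:", [str(p) for p in f_y])
```

Both marginals sum to 1, which is a quick sanity check that no cell was dropped.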
Figure: 3D bar chart of the joint PMF over the (X, Y) categories. Each bar's height is the joint probability fXY(x, y), and the bar heights sum to 1.
1.2) Marginals & Conditional Probability — Lotte Market (n=23)
Marginal of X (minutes)
fX(x) = ∑y fXY(x,y)
Marginal of Y ($ spent)
fY(y) = ∑x fXY(x,y)
Table (rows = Y, cols = X):
[[4,2,1],[2,5,3],[0,2,4]] with total 23.
Reminder: For a row/column in the joint table, conditional probability is “cell ÷ that row/column total.” Example: P(Y=10–30 | X=10–30) = 5/9 ≈ 0.556, since the X=10–30 column has 9 students total.
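The "cell ÷ total" rule can be sketched directly on the count table. The helper names `cond_y_given_x` and `cond_x_given_y` are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction

# Counts with rows = Y (money) and columns = X (minutes), as in the table above.
counts = [[4, 2, 1], [2, 5, 3], [0, 2, 4]]

def cond_y_given_x(counts, x_idx):
    """P(Y = y | X = x) for every y: column entries divided by the column total."""
    column = [row[x_idx] for row in counts]
    total = sum(column)
    return [Fraction(c, total) for c in column]

def cond_x_given_y(counts, y_idx):
    """P(X = x | Y = y) for every x: row entries divided by the row total."""
    row = counts[y_idx]
    total = sum(row)
    return [Fraction(c, total) for c in row]

# Condition on X = 10-30 min (column index 1).
given_x = cond_y_given_x(counts, 1)
print([str(p) for p in given_x])
```

Each conditional distribution sums to 1 because the divisor is exactly the total of the row or column being conditioned on.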
1.3) Quick Real-World Examples of Joint Variables (no math)
Use these to recognize when you have a joint distribution. They say what X and Y are, what values they can take (the “support”), and why they’re modeled together.
Discrete–Discrete
- Website traffic: X = # visits in an hour, Y = # checkouts in that hour. (Support: nonnegative integers.) Counts tend to move together.
- Call center: X = # incoming calls 9–10am, Y = # dropped calls 9–10am. (Integers.) Joint because more calls can mean more drops.
- Dice game: X = die 1 outcome, Y = die 2 outcome. (1–6 each.) Classic independent case.
- Quality control: X = defects per unit A, Y = defects per unit B from same line. (Integers.) Dependence via shared conditions.
Continuous–Continuous
- Network latencies: X = connect time (ms), Y = auth time (ms). (Positive reals.) Both are positive, and Y ≥ X when both are measured from the same start; the two can be dependent.
- Finance intraday: X = stock A return (%, 1-min), Y = stock B return (%, 1-min). (Real numbers.) Correlated due to market factors.
- Manufacturing: X = part length (mm), Y = part width (mm). (Positive reals within tolerances.) Joint for yield estimation.
- Weather: X = temperature (°F), Y = humidity (%). (Ranges like 0–120 and 0–100.) Nonlinear relationship common.
- Reliability: X = time to first failure, Y = time to full replacement. (Positive.) Natural constraint X < Y.
- Medical: X = systolic BP, Y = diastolic BP. (Realistic ranges.) Strong biological dependence.
Mixed (one discrete, one continuous)
- E-commerce: X = number of items in cart (integer), Y = total spend ($, continuous). Larger X tends to increase Y.
- Queues: X = number waiting (integer), Y = individual wait time (minutes). More people, longer waits.
- Education: X = # study sessions this week (integer), Y = exam score (0–100). Behavior–outcome linkage.
Supports you’ll see (described, not computed)
- Rectangle: both variables free in ranges (e.g., length 18–22mm, width 8–12mm). Often independence is assumed then checked.
- Triangle “X < Y”: ordering or sequence times (start vs finish, connect vs authorize). Only points with X less than Y are allowed.
- Band/curve constraints: physical laws or policies tie values (e.g., speed vs fuel use within safe operating band).
Why joint?
- Plan probabilities about both together: “What’s the chance wait < 5 min and queue size ≤ 3?”
- Summarize each alone: get the marginals for X or Y (e.g., distribution of total spend regardless of cart size).
- Condition on one: “Given 10 callers are waiting, what is the expected wait?” (decision support, SLAs, staffing).
- Test independence/correlation: do X and Y move together or not? (risk, hedging, causality hints).
Rule of thumb: If your question mentions both X and Y in the same event or decision, you’re in joint-distribution land.
2) Marginal Distributions
Continuous: $f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\, dy$, $f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x, y)\, dx$.
Marginal CDFs: $F_X(x) = \int_{-\infty}^{x} f_X(u)\, du$, $F_Y(y) = \int_{-\infty}^{y} f_Y(v)\, dv$.
If X and Y are independent, then fXY(x, y) = fX(x) fY(y) and FXY(x, y) = FX(x) FY(y).
2.1) ✅ TL;DR Summary — Symbols & Integration Variables
| Symbol | Meaning | Integration variable | 
|---|---|---|
| fXY(x, y) | Joint PDF | variables x, y | 
| fX(x) | Marginal PDF of X | integrated over y | 
| fY(y) | Marginal PDF of Y | integrated over x | 
| FX(x) | CDF of X | integrate fX(u) with respect to u up to x | 
| FY(y) | CDF of Y | integrate fY(v) with respect to v up to y | 
2.2) Mean & Variance from a Joint Distribution
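From the marginals:
$E[X] = \int_{-\infty}^{\infty} x\, f_X(x)\, dx$, $E[Y] = \int_{-\infty}^{\infty} y\, f_Y(y)\, dy$.
$\mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\, dx = E[X^2] - \mu_X^2$, where $\mu_X = E[X]$.
$\mathrm{Var}(Y) = \int_{-\infty}^{\infty} (y - \mu_Y)^2 f_Y(y)\, dy = E[Y^2] - \mu_Y^2$, where $\mu_Y = E[Y]$.
Equivalently, directly from the joint PDF:
$E[X] = \iint x\, f_{XY}(x, y)\, dx\, dy$, $E[Y] = \iint y\, f_{XY}(x, y)\, dx\, dy$.
$E[X^2] = \iint x^2 f_{XY}(x, y)\, dx\, dy$, $E[Y^2] = \iint y^2 f_{XY}(x, y)\, dx\, dy$.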
2.3) Covariance & Correlation (optional)
Quick worked values for our examples
Discrete table (Section 3):
fX(1)=0.20, fX(2)=0.25, fX(3)=0.55.
fY(1)=0.28, fY(2)=0.25, fY(3)=0.17, fY(4)=0.30.
E[X] = 1·0.20 + 2·0.25 + 3·0.55 = 2.35.
E[Y] = 1·0.28 + 2·0.25 + 3·0.17 + 4·0.30 = 2.49.
Continuous example (Section 5, $f(x, y) = 2e^{-(x+y)}$ on 0 < x < y):
Marginals: $f_X(x) = 2e^{-2x}$ (x > 0), $f_Y(y) = 2e^{-y} - 2e^{-2y}$ (y > 0).
E[X] = 0.5, Var(X) = 0.25.
E[Y] = 1.5, Var(Y) = 1.25.
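These values can be reproduced numerically. The sketch below is illustrative only: it takes the discrete marginals exactly as listed above and approximates the continuous integrals for Y with a simple midpoint rule. The cutoff `UPPER` and the step count are assumptions made here, not part of the notes.

```python
import math

# Discrete example: marginal PMFs as listed above.
f_x = {1: 0.20, 2: 0.25, 3: 0.55}
f_y = {1: 0.28, 2: 0.25, 3: 0.17, 4: 0.30}

def pmf_mean_var(pmf):
    """Return (mean, variance) using Var = E[V^2] - (E[V])^2."""
    mean = sum(v * p for v, p in pmf.items())
    second_moment = sum(v * v * p for v, p in pmf.items())
    return mean, second_moment - mean ** 2

def midpoint(g, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

def f_y_cont(y):
    """Continuous example: marginal PDF of Y, 2e^{-y} - 2e^{-2y} for y > 0."""
    return 2 * math.exp(-y) - 2 * math.exp(-2 * y)

UPPER = 50.0  # assumed cutoff; the tail beyond it is negligible

mean_x, var_x = pmf_mean_var(f_x)
mean_y, var_y = pmf_mean_var(f_y)
mean_yc = midpoint(lambda y: y * f_y_cont(y), 0.0, UPPER)
var_yc = midpoint(lambda y: y * y * f_y_cont(y), 0.0, UPPER) - mean_yc ** 2

print(round(mean_x, 4), round(mean_y, 4))
print(round(mean_yc, 4), round(var_yc, 4))
```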
3) Discrete Illustration (Joint PMF Table)
A joint PMF can be shown in a table; row/column sums give the marginals.
| y \ x | 1 | 2 | 3 | Marginal fY(y) |
|---|---|---|---|---|
| 1 | 0.01 | 0.02 | 0.25 | 0.28 | 
| 2 | 0.02 | 0.03 | 0.20 | 0.25 | 
| 3 | 0.02 | 0.10 | 0.05 | 0.17 | 
| 4 | 0.15 | 0.10 | 0.05 | 0.30 | 
| Marginal fX(x) | 0.20 | 0.25 | 0.55 | 1.00 | 
How to read this table:
What are X and Y?
X = number of requests (1–3). Y = response-time category (1–4).
Cell value: fXY(x,y) = P(X=x, Y=y).
Examples: fXY(3,1)=0.25; fXY(2,3)=0.10; fXY(1,4)=0.15; fXY(3,2)=0.20.
All twelve cells sum to 1.00.
What are the marginals?
fX(x): add down the column. fY(y): add across the row.
Example: fX(2)=0.25, fY(4)=0.30.
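For the covariance and correlation named in Section 2.3, a minimal sketch on this table is shown below. It assumes the standard definitions Cov(X, Y) = E[XY] − E[X]E[Y] and ρ = Cov(X, Y) / (σX σY); the dictionary layout and the helper `expect` are choices made for this illustration.

```python
import math

# Joint PMF from the table above: key = (x, y), value = f_XY(x, y).
joint = {
    (1, 1): 0.01, (2, 1): 0.02, (3, 1): 0.25,
    (1, 2): 0.02, (2, 2): 0.03, (3, 2): 0.20,
    (1, 3): 0.02, (2, 3): 0.10, (3, 3): 0.05,
    (1, 4): 0.15, (2, 4): 0.10, (3, 4): 0.05,
}

def expect(g):
    """E[g(X, Y)] as a weighted sum over all cells of the table."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: x * x) - mean_x ** 2
var_y = expect(lambda x, y: y * y) - mean_y ** 2
cov_xy = expect(lambda x, y: x * y) - mean_x * mean_y
corr_xy = cov_xy / math.sqrt(var_x * var_y)

print(round(cov_xy, 4), round(corr_xy, 4))
```

The covariance comes out negative for this table, which matches the visible pattern: large x pairs mostly with small y, and x = 1 mostly with y = 4.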
4) Continuous Example (Server Access Time, 0 < x < y)
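Let X be connect time (ms) and Y authorization time (ms) with joint PDF on 0 < x < y:
$f_{XY}(x, y) = 6\times10^{-6}\, e^{-0.001x - 0.002y}$ (0 otherwise).
For a, b ≥ 0 and m = min(a, b),
$P(X \le a, Y \le b) = 1 - e^{-0.003m} - 3e^{-0.002b}\,(1 - e^{-0.001m})$.
Figure: the region to integrate over. The probability is the integral of the PDF over the shaded region.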
Check: with a = 1000, b = 2000, the probability ≈ 0.915.
4.1) Worked Derivation of P(X ≤ a, Y ≤ b) (textbook style)
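Region split (b ≥ a case)
Using the joint PDF $f_{XY}(x, y) = 6\times10^{-6}\, e^{-0.001x - 0.002y}$ on 0 < x < y, the region {x ≤ a, y ≤ b} meets the support in a triangle (0 < x < y ≤ a) and a strip (0 < x ≤ a, a < y ≤ b), so $P(X \le a, Y \le b) = I_1 + I_2$.
Compute I1
$I_1 = \int_0^a \int_0^y f_{XY}(x, y)\, dx\, dy = \int_0^a 0.006\, e^{-0.002y}\,(1 - e^{-0.001y})\, dy = 1 - 3e^{-0.002a} + 2e^{-0.003a}$.
Compute I2
$I_2 = \int_a^b \int_0^a f_{XY}(x, y)\, dx\, dy = (1 - e^{-0.001a}) \int_a^b 0.006\, e^{-0.002y}\, dy = 3\,(1 - e^{-0.001a})\,(e^{-0.002a} - e^{-0.002b})$.
Add and simplify
$I_1 + I_2 = 1 - e^{-0.003a} - 3e^{-0.002b}\,(1 - e^{-0.001a})$.
(The $-e^{-0.003a}$ term comes from $2e^{-0.003a} - 3e^{-0.003a}$.)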
$P(X \le a, Y \le b) = 1 - e^{-0.003m} - 3e^{-0.002b}\,(1 - e^{-0.001m})$.
Plug in a = 1000, b = 2000
$P = 1 - e^{-3} - 3e^{-4}\,(1 - e^{-1}) = 0.915480$. Complement = 0.084520.
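The closed form can be cross-checked against a direct numerical integration. The sketch below assumes the joint PDF is 6×10⁻⁶ · exp(−0.001x − 0.002y) on 0 < x < y, which is the density implied by the closed form (its mixed partial derivative in a and b). The function names and the grid size `n` are choices made for this illustration.

```python
import math

def joint_cdf(a, b):
    """Closed form for P(X <= a, Y <= b) with m = min(a, b), for a, b >= 0."""
    m = min(a, b)
    return (1 - math.exp(-0.003 * m)
            - 3 * math.exp(-0.002 * b) * (1 - math.exp(-0.001 * m)))

def numeric_prob(a, b, n=400):
    """Iterated midpoint rule: x runs over (0, min(a, b)), y over (x, b)."""
    m = min(a, b)
    hx = m / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        hy = (b - x) / n  # inner limits come from the support x < y <= b
        inner = sum(
            6e-6 * math.exp(-0.001 * x - 0.002 * (x + (j + 0.5) * hy))
            for j in range(n)
        )
        total += inner * hy * hx
    return total

closed = joint_cdf(1000, 2000)
approx = numeric_prob(1000, 2000)
print(round(closed, 6), round(approx, 4))
```

Taking the inner limits from the support (y from x to b) keeps the integrand smooth, so a modest grid already agrees with the closed form to several decimals.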
5) Continuous (Very Clear): Solve k, Marginals, CDF + 3D View (support 0 < x < y)
This triangular-support example is separate from the server-time example above.
5.1 Define the PDF and find k
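Let $f(x, y) = k\, e^{-(x+y)}$ on 0 < x < y < ∞ (0 otherwise). Find k by enforcing ∬ f = 1:
$\int_0^\infty \int_0^y k\, e^{-(x+y)}\, dx\, dy$
      $= k \int_0^\infty e^{-y}\,(1 - e^{-y})\, dy$
      $= k\,(1 - 1/2) = k \cdot 1/2$.
Setting this equal to 1 gives k = 2, so $f(x, y) = 2e^{-(x+y)}$ on 0 < x < y.
5.2 Marginal PDFs
$f_X(x) = \int_x^\infty 2e^{-(x+y)}\, dy = 2e^{-2x}$ for x > 0.
$f_Y(y) = \int_0^y 2e^{-(x+y)}\, dx = 2e^{-y} - 2e^{-2y}$ for y > 0.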
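The marginals of f(x, y) = 2e^{−(x+y)} on 0 < x < y can be checked numerically against the closed forms quoted in these notes, 2e^{−2x} and 2e^{−y} − 2e^{−2y}. This is a rough sketch: the cutoff `UPPER`, the step count, and the test points are arbitrary choices made here.

```python
import math

def f(x, y):
    """Joint PDF with k = 2 on the triangular support 0 < x < y."""
    return 2.0 * math.exp(-(x + y)) if 0 < x < y else 0.0

def midpoint(g, lo, hi, steps=20_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

UPPER = 40.0  # assumed stand-in for infinity; the tail beyond it is negligible

def marginal_x(x):
    """f_X(x): integrate out y, which the support restricts to (x, infinity)."""
    return midpoint(lambda y: f(x, y), x, UPPER)

def marginal_y(y):
    """f_Y(y): integrate out x, which the support restricts to (0, y)."""
    return midpoint(lambda x: f(x, y), 0.0, y)

fx_num = marginal_x(0.7)
fy_num = marginal_y(1.3)
print(round(fx_num, 5), round(2 * math.exp(-1.4), 5))
print(round(fy_num, 5), round(2 * math.exp(-1.3) - 2 * math.exp(-2.6), 5))
```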
5.3 Joint CDF F(x,y) = P(X ≤ x, Y ≤ y)
Because the support is 0 < x < y, the rectangle [0,x]×[0,y] intersects the triangle differently depending on y ≤ x or y > x.
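As a sketch of the two cases for $f(x, y) = 2e^{-(x+y)}$ on 0 < x < y (worked out here by integrating over the intersection of the rectangle with the triangle; treat it as a check on the geometry rather than the section's own statement):

```latex
\[
\begin{aligned}
&\text{Case } 0 < x \le y:\\
&F(x, y) = \int_0^x \int_u^y 2e^{-(u+v)}\, dv\, du
         = \int_0^x 2e^{-u}\,\bigl(e^{-u} - e^{-y}\bigr)\, du
         = 1 - e^{-2x} - 2e^{-y}\,\bigl(1 - e^{-x}\bigr).\\[6pt]
&\text{Case } 0 < y < x \text{ (the rectangle already contains every support point with } Y \le y\text{):}\\
&F(x, y) = P(Y \le y) = \int_0^y \bigl(2e^{-v} - 2e^{-2v}\bigr)\, dv
         = \bigl(1 - e^{-y}\bigr)^2.
\end{aligned}
\]
```

The two expressions agree on the boundary y = x, where both reduce to $(1 - e^{-x})^2$.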
Figure (left): the geometry, showing where to integrate.
Figure (right): 3D wireframe of the PDF surface $z = 2e^{-(x+y)}$ (domain clipped to 0 ≤ x, y ≤ 5).
Probability = the volume under this surface above the shaded region in the left panel.
6) Limits & Order of Integration — FAQ (very clear)
6.1 Why are the inner limits 0 to y?
Support: S = { (x,y): 0 < x < y < ∞ }. Fix y. The support allows x only between 0 and y. Therefore, $\iint_S f(x, y)\, dx\, dy = \int_{0}^{\infty} \int_{0}^{y} f(x, y)\, dx\, dy$.
6.2 Can I integrate dy first instead?
Yes. Fix x. Then the support allows y from x to ∞. So the equivalent order is $\int_{0}^{\infty} \int_{x}^{\infty} f(x, y)\, dy\, dx$.
By Fubini/Tonelli (nonnegative integrable pdf), both orders give the same result.
6.3 Where do the −∞ bounds appear?
In the joint CDF definition: $F_{XY}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{XY}(u, v)\, dv\, du$.
For normalization of a pdf, integrate over the support, not −∞..∞ blindly.
6.4 Normalization solved both ways (they match)
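dx first: $\int_0^\infty \int_0^y k\, e^{-(x+y)}\, dx\, dy = k \int_0^\infty e^{-y}\,(1 - e^{-y})\, dy = k/2$.
dy first: $\int_0^\infty \int_x^\infty k\, e^{-(x+y)}\, dy\, dx = k \int_0^\infty e^{-2x}\, dx = k/2$.
Set either equal to 1 ⇒ k = 2.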
6.5 Geometry recap
S is the triangle above the line y=x in the first quadrant. For each y>0, x runs 0→y (vertical slices). For each x>0, y runs x→∞ (horizontal slices). Limits always come from S.
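The two slicing orders can be compared numerically. The sketch below integrates e^{−(x+y)} (that is, k = 1) over the triangle in both orders and recovers k as the reciprocal. The cutoff `UPPER` and the grid size `STEPS` are assumptions made for this illustration, so agreement is only to about three decimals.

```python
import math

UPPER = 30.0  # assumed stand-in for infinity
STEPS = 400   # grid size per dimension (coarse, for speed)

def midpoint(g, lo, hi, steps=STEPS):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

def dx_first():
    """Outer y in (0, UPPER); for each fixed y, inner x runs from 0 to y."""
    return midpoint(
        lambda y: midpoint(lambda x: math.exp(-(x + y)), 0.0, y),
        0.0, UPPER,
    )

def dy_first():
    """Outer x in (0, UPPER); for each fixed x, inner y runs from x to UPPER."""
    return midpoint(
        lambda x: midpoint(lambda y: math.exp(-(x + y)), x, UPPER),
        0.0, UPPER,
    )

area_dx = dx_first()
area_dy = dy_first()
k_estimate = 1.0 / area_dx
print(round(area_dx, 3), round(area_dy, 3), round(k_estimate, 2))
```

In both functions the inner limits are read off the support, which is the point of this section: changing the order changes the limits, not the value.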
7) Student FAQ
Short questions with short answers.
Q. What is a joint PDF vs. a joint PMF?
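A joint PMF fXY(x,y) = P(X=x, Y=y) is for discrete X, Y (probabilities at grid points). A joint PDF fXY(x,y) is for continuous X, Y (a surface). The probability of a region R is the volume under the surface: $\iint_R f_{XY}(x, y)\, dx\, dy$.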
Q. What is the support and why do I care?
The support is the set of (x, y) where the pdf is nonzero. All integration limits come from the support: where the pdf is 0, there is nothing to integrate.
Q. How do I check if a joint pdf is valid?
Check that f(x, y) ≥ 0 everywhere and that it integrates to 1 over the support. Example: $f(x, y) = k\, e^{-(x+y)}$ on 0 < x < y ⇒ k = 2.
Q. When do I integrate dx first vs dy first?
For 0<x<y: vertical slices ⇒ x:0→y; horizontal slices ⇒ y:x→∞. Both give the same result.
Q. Why do I sometimes see −∞ in integrals?
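General definitions such as the joint CDF are written over all of ℝ². Because the pdf is 0 outside the support, the effective limits shrink to the support. For normalization or marginals, integrate over the support, not −∞..∞ blindly.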
Q. How do I get marginals from a joint pdf?
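$f_X(x) = \int f(x, y)\, dy$ with y-limits from the support at that x; $f_Y(y) = \int f(x, y)\, dx$ with x-limits from the support at that y.
Example (0<x<y): $f_X(x) = \int_{y=x}^{\infty} 2e^{-(x+y)}\, dy = 2e^{-2x}$.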
Q. What does “independent” mean here?
If X and Y are independent, knowing one tells you nothing about the other.
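Q. Does zero correlation mean independence?
No. Independence implies zero correlation, but zero correlation only rules out a linear relationship; X and Y can still be dependent (for example, X symmetric about 0 and Y = X²).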
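Q. How do I compute P(a<X<b, c<Y<d) with a pdf?
Integrate the pdf over the part of the rectangle that lies in the support: $P(a < X < b,\ c < Y < d) = \int_a^b \int_c^d f_{XY}(x, y)\, dy\, dx$, with the pdf taken as 0 outside the support.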
Q. How do I get a conditional pdf?
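Divide the joint pdf by the marginal of the conditioning variable: $f_{X|Y}(x \mid y) = f_{XY}(x, y) / f_Y(y)$ wherever $f_Y(y) > 0$.
For 0<x<y with $f = 2e^{-(x+y)}$: $f_{X|Y}(x \mid y) = 2e^{-(x+y)} / (2e^{-y} - 2e^{-2y}) = e^{-x} / (1 - e^{-y})$ for 0<x<y.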
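A quick numerical check of this conditional pdf: for a fixed y it should match the simplified closed form and integrate to 1 over 0 < x < y. The function names and the test value `y0` are hypothetical, chosen for this sketch.

```python
import math

def joint_pdf(x, y):
    """f(x, y) = 2e^{-(x+y)} on 0 < x < y, else 0."""
    return 2.0 * math.exp(-(x + y)) if 0 < x < y else 0.0

def marginal_y(y):
    """f_Y(y) = 2e^{-y} - 2e^{-2y} for y > 0, else 0."""
    return 2.0 * math.exp(-y) - 2.0 * math.exp(-2.0 * y) if y > 0 else 0.0

def cond_x_given_y(x, y):
    """f_{X|Y}(x | y) = f(x, y) / f_Y(y); zero outside 0 < x < y."""
    fy = marginal_y(y)
    return joint_pdf(x, y) / fy if fy > 0 else 0.0

def closed_form(x, y):
    """Simplified expression e^{-x} / (1 - e^{-y}) on 0 < x < y."""
    return math.exp(-x) / (1.0 - math.exp(-y)) if 0 < x < y else 0.0

def midpoint(g, lo, hi, steps=20_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

y0 = 1.5  # arbitrary conditioning value
area = midpoint(lambda x: cond_x_given_y(x, y0), 0.0, y0)
print(round(area, 6))
```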
Q. Discrete table: how do I compute a conditional probability?
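Divide the cell by the total of the row or column you condition on, for example P(Y=y | X=x) = fXY(x,y) / fX(x).
Independence test: check whether each cell equals column-marginal × row-marginal (for sample counts, approximately equals).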
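The independence check can be sketched on the Lotte Market counts ([[4,2,1],[2,5,3],[0,2,4]], rows = Y, columns = X). This only compares each cell with the product of its marginals; how large a gap counts as "not approximately equal" for a sample of 23 is a judgment the sketch does not make.

```python
from fractions import Fraction

counts = [[4, 2, 1], [2, 5, 3], [0, 2, 4]]  # rows = Y, columns = X
n = sum(sum(row) for row in counts)

joint = [[Fraction(c, n) for c in row] for row in counts]
f_y = [sum(row) for row in joint]            # row marginals
f_x = [sum(col) for col in zip(*joint)]      # column marginals

# Gap between each cell and the product of its marginals (0 everywhere
# would mean the table factorizes exactly).
gaps = [
    [joint[i][j] - f_y[i] * f_x[j] for j in range(len(f_x))]
    for i in range(len(f_y))
]
max_gap = max(abs(g) for row in gaps for g in row)
factorizes_exactly = max_gap == 0

print(float(max_gap), factorizes_exactly)
```

One cell makes the point on its own: no student spent $30–80 in a 5–10 minute visit, so that joint probability is 0 while the product of its marginals is (6/23)(6/23).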
Q. Common mistakes checklist
• Forgetting that PDF ≠ probability at a point (only areas give probability).
• Mixing up inner/outer limits; draw the region first.
• Assuming independence without checking factorization.
• Dropping units: keep track of what x and y represent.