Section 5.1 — Joint & Marginal Distributions


Joint Probability Mass Function (Discrete)

(1) Nonnegativity:   fXY(x, y) ≥ 0
(2) Normalization:   ∑_x ∑_y fXY(x, y) = 1
(3) Event probability:   fXY(x, y) = P(X = x, Y = y)

1) Joint Distributions for Two Random Variables

Continuous (joint PDF):
fXY(x, y) ≥ 0,   ∬_{ℝ²} fXY(x, y) dx dy = 1.

For a region R ⊆ ℝ²,
P{ (X, Y) ∈ R } = ∬_R fXY(x, y) dx dy.

Joint CDF:
FXY(x, y) = P{ X ≤ x, Y ≤ y } = ∫_{−∞}^{x} ∫_{−∞}^{y} fXY(u, v) dv du.

Interpretation: the joint PMF assigns probabilities to grid points (x, y); the joint PDF is a surface whose volume over a region equals the probability that (X, Y) falls in that region.

1.1) Class Survey Example — Lotte Market Visit (23 Students)

A simple real dataset: 23 JU students visited Lotte Market (Jacksonville). Each student recorded (1) minutes spent and (2) money spent ($). We’ll treat these as jointly distributed random variables: X = minutes spent, Y = money spent ($).

Observed counts (out of 23 total):

Y \ X | 5–10 min | 10–30 min | 30–60 min | Total fY(y)
Under $10 | 4 | 2 | 1 | 7
$10–30 | 2 | 5 | 3 | 10
$30–80 | 0 | 2 | 4 | 6
Total fX(x) | 6 | 9 | 8 | 23

Convert to probabilities by dividing each count by 23.

Example: fXY(10–30, $10–30) = 5 / 23 ≈ 0.217.
Marginal fX(10–30) = (2+5+2)/23 = 9/23 ≈ 0.391.
Marginal fY($10–30) = (2+5+3)/23 = 10/23 ≈ 0.435.
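
The conversion from counts to a joint PMF and its marginals can be sketched in a few lines of Python. This is only an illustration: the variable names (`counts`, `joint`, `f_x`, `f_y`) are made up here, and exact fractions are used so the results can be compared directly with 5/23, 9/23 and 10/23.

```python
from fractions import Fraction

# Observed counts from the survey table: rows = Y (money), columns = X (minutes).
x_labels = ["5-10 min", "10-30 min", "30-60 min"]
y_labels = ["Under $10", "$10-30", "$30-80"]
counts = [
    [4, 2, 1],
    [2, 5, 3],
    [0, 2, 4],
]

n = sum(sum(row) for row in counts)  # total number of students (23)

# Joint PMF: every count divided by the total.
joint = [[Fraction(c, n) for c in row] for row in counts]

# Marginals: sum the joint PMF over the other variable.
f_x = [sum(joint[i][j] for i in range(len(y_labels))) for j in range(len(x_labels))]
f_y = [sum(row) for row in joint]

print("fXY(10-30 min, $10-30) =", joint[1][1])
for label, p in zip(x_labels, f_x):
    print("fX(%s) = %s (about %.3f)" % (label, p, float(p)))
for label, p in zip(y_labels, f_y):
    print("fY(%s) = %s (about %.3f)" % (label, p, float(p)))
```

Both marginals should sum to 1, which is a quick sanity check on the table.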

Interactive 3D Joint PMF Visualization

[Interactive 3D bar chart: each bar’s height is the joint probability fXY(x, y) for one (X, Y) category pair, and the bar heights sum to 1. The joint PMF is now a set of bars over the (X, Y) grid rather than a single row of bars.]

1.2) Marginals & Conditional Probability — Lotte Market (n=23)

Marginal of X (minutes)

fX(x) = ∑_y fXY(x, y)

Marginal of Y ($ spent)

fY(y) = ∑_x fXY(x, y)

Data recap (counts): X ∈ {5–10, 10–30, 30–60},   Y ∈ {<10, 10–30, 30–80}.
Table (rows = Y, cols = X): [[4,2,1],[2,5,3],[0,2,4]] with total 23.

Reminder: For a row/column in the joint table, conditional probability is “cell ÷ that row/column total.” Example: P(Y=10–30 | X=10–30) = 5/9 ≈ 0.556, since the X=10–30 column has 9 students total.
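
As a rough sketch of the "cell ÷ total" rule, the snippet below conditions on a column (a value of X) or on a row (a value of Y) of the count table. The function names are hypothetical helpers written for this illustration only.

```python
from fractions import Fraction

# Counts with rows = Y (Under $10, $10-30, $30-80) and columns = X (5-10, 10-30, 30-60 min).
counts = [
    [4, 2, 1],
    [2, 5, 3],
    [0, 2, 4],
]

def conditional_y_given_x(counts, col):
    """Return P(Y = y | X = column `col`) for every row y: cell / column total."""
    column = [row[col] for row in counts]
    total = sum(column)
    return [Fraction(c, total) for c in column]

def conditional_x_given_y(counts, row):
    """Return P(X = x | Y = row `row`) for every column x: cell / row total."""
    total = sum(counts[row])
    return [Fraction(c, total) for c in counts[row]]

# Condition on X = 10-30 min (column index 1).
p_y_given_x = conditional_y_given_x(counts, 1)
print(p_y_given_x[1], round(float(p_y_given_x[1]), 3))  # 5/9 0.556
```

Each conditional distribution sums to 1, because dividing by the row or column total renormalizes that slice of the table.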

1.3) Quick Real-World Examples of Joint Variables (no math)

Use these to recognize when you have a joint distribution. They say what X and Y are, what values they can take (the “support”), and why they’re modeled together.

Discrete–Discrete

Continuous–Continuous

Mixed (one discrete, one continuous)

Supports you’ll see (described, not computed)

Why joint?

Rule of thumb: If your question mentions both X and Y in the same event or decision, you’re in joint-distribution land.

2) Marginal Distributions

Discrete:   fX(x) = ∑_y fXY(x, y),    fY(y) = ∑_x fXY(x, y).

Continuous:   fX(x) = ∫_{−∞}^{∞} fXY(x, y) dy,    fY(y) = ∫_{−∞}^{∞} fXY(x, y) dx.

Marginal CDFs:   FX(x) = ∫_{−∞}^{x} fX(u) du,   FY(y) = ∫_{−∞}^{y} fY(v) dv.

If X and Y are independent, then fXY(x, y) = fX(x) fY(y) and FXY(x, y) = FX(x) FY(y).

2.1) ✅ TL;DR Summary — Symbols & Integration Variables

Symbol | Meaning | Integration variable
fXY(x, y) | Joint PDF | variables x, y
fX(x) | Marginal PDF of X | integrated over y
fY(y) | Marginal PDF of Y | integrated over x
FX(x) | CDF of X | integrate fX(u) with respect to u up to x
FY(y) | CDF of Y | integrate fY(v) with respect to v up to y
Note: u and v are just dummy variables of integration.

2.2) Mean & Variance from a Joint Distribution

From the marginals
E[X] = ∫_{−∞}^{∞} x fX(x) dx,   E[Y] = ∫_{−∞}^{∞} y fY(y) dy.

Var(X) = ∫_{−∞}^{∞} (x − μX)² fX(x) dx = E[X²] − μX²,   where μX = E[X].
Var(Y) = ∫_{−∞}^{∞} (y − μY)² fY(y) dy = E[Y²] − μY²,   where μY = E[Y].
Directly from the joint pdf/pmf
E[X] = ∬ x fXY(x, y) dx dy,   E[Y] = ∬ y fXY(x, y) dx dy.
E[X²] = ∬ x² fXY(x, y) dx dy,   E[Y²] = ∬ y² fXY(x, y) dx dy.

2.3) Covariance & Correlation (optional)

Cov(X, Y) = E[XY] − E[X]E[Y],   ρ = Cov(X, Y) / √(Var(X) Var(Y)).
For f(x, y) = 2e^{−(x+y)} on 0 < x < y: E[XY] = 1 ⇒ Cov = 1 − (1/2)(3/2) = 1/4,   ρ = 1/√5 ≈ 0.447.
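
The value E[XY] = 1 is stated without its steps. One way to obtain it, integrating x first over the support 0 < x < y, is sketched below; the variance values Var(X) = 1/4 and Var(Y) = 5/4 are the ones quoted in the worked values for this example.

```latex
\begin{aligned}
E[XY] &= \int_{0}^{\infty}\int_{0}^{y} x\,y\;2e^{-(x+y)}\,dx\,dy \\
      &= 2\int_{0}^{\infty} y\,e^{-y}\Bigl[1-(1+y)e^{-y}\Bigr]\,dy
         \qquad\text{since } \int_{0}^{y} x e^{-x}\,dx = 1-(1+y)e^{-y} \\
      &= 2\left[\int_{0}^{\infty} y e^{-y}\,dy-\int_{0}^{\infty}\bigl(y+y^{2}\bigr)e^{-2y}\,dy\right] \\
      &= 2\left[\,1-\left(\tfrac14+\tfrac14\right)\right] = 1, \\[4pt]
\operatorname{Cov}(X,Y) &= 1-\tfrac12\cdot\tfrac32=\tfrac14, \qquad
\rho=\frac{1/4}{\sqrt{\tfrac14\cdot\tfrac54}}=\frac{1}{\sqrt5}\approx 0.447 .
\end{aligned}
```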

Quick worked values for our examples

Discrete table (Section 3)
fX(1)=0.20,  fX(2)=0.25,  fX(3)=0.55.
fY(1)=0.28,  fY(2)=0.25,  fY(3)=0.17,  fY(4)=0.30.

E[X] = 1·0.20 + 2·0.25 + 3·0.55 = 2.35.
E[Y] = 1·0.28 + 2·0.25 + 3·0.17 + 4·0.30 = 2.49.
Triangular-support continuous (Section 5)
Marginals: fX(x) = 2e^{−2x} (x > 0),   fY(y) = 2e^{−y} − 2e^{−2y} (y > 0).

E[X] = 0.5,   Var(X) = 0.25.
E[Y] = 1.5,   Var(Y) = 1.25.
Tip: Use marginals when you have them; it’s usually simpler. For continuous variables, the pdf has units 1/(unitx·unity), so ∫ x·f or ∫ y·f yields unit-consistent expectations.
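
The continuous values above can be sanity-checked numerically, working directly from the joint pdf. The sketch below uses a plain midpoint rule over the support 0 < x < y. The truncation point `upper`, the grid size `n`, and the function name `moments` are choices made for this illustration, so the results are approximations rather than exact values.

```python
import math

def moments(upper=20.0, n=600):
    """Midpoint-rule moments of f(x, y) = 2 e^{-(x+y)} on 0 < x < y.

    The outer variable x runs over (0, upper); for each x the inner variable y
    runs over (x, upper), mirroring the support. `upper` stands in for infinity.
    """
    hx = upper / n
    s1 = sx = sy = sxx = syy = sxy = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        hy = (upper - x) / n  # inner step adapts to the length of the slice
        for j in range(n):
            y = x + (j + 0.5) * hy
            w = 2.0 * math.exp(-(x + y)) * hx * hy  # pdf value times cell area
            s1 += w
            sx += x * w
            sy += y * w
            sxx += x * x * w
            syy += y * y * w
            sxy += x * y * w
    var_x = sxx - sx ** 2
    var_y = syy - sy ** 2
    cov = sxy - sx * sy
    return {
        "total": s1,
        "E[X]": sx, "Var(X)": var_x,
        "E[Y]": sy, "Var(Y)": var_y,
        "rho": cov / math.sqrt(var_x * var_y),
    }

m = moments()
for name, value in m.items():
    print(name, round(value, 3))
```

The printed values should land close to 1, 0.5, 0.25, 1.5, 1.25 and 0.447 respectively; small differences come from the grid and the truncation.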

3) Discrete Illustration (Joint PMF Table)

A joint PMF can be shown in a table; row/column sums give the marginals.

y \ x | 1 | 2 | 3 | Marginal fY(y)
1 | 0.01 | 0.02 | 0.25 | 0.28
2 | 0.02 | 0.03 | 0.20 | 0.25
3 | 0.02 | 0.10 | 0.05 | 0.17
4 | 0.15 | 0.10 | 0.05 | 0.30
Marginal fX(x) | 0.20 | 0.25 | 0.55 | 1.00

How to read this table (very clear):

What are X and Y?
X = number of requests (1–3). Y = response-time category (1–4).

Cell value: fXY(x,y) = P(X=x, Y=y).

Examples: fXY(3,1) = 0.25; fXY(2,3) = 0.10; fXY(1,4) = 0.15; fXY(3,2) = 0.20.

All twelve cells sum to 1.00.

What are the marginals?

fX(x): add down the column. fY(y): add across the row.

Example: fX(2)=0.25, fY(4)=0.30.
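
To make the row/column sums concrete, here is a small Python sketch that rebuilds the marginals and the means E[X], E[Y] from the table. The dictionary layout (`table` keyed by y, one tuple of cell values per row) is just one convenient encoding chosen for this example.

```python
# Joint PMF from the table: keys are y = 1..4, tuple entries are x = 1, 2, 3.
xs = (1, 2, 3)
table = {
    1: (0.01, 0.02, 0.25),
    2: (0.02, 0.03, 0.20),
    3: (0.02, 0.10, 0.05),
    4: (0.15, 0.10, 0.05),
}

# fY(y): add across each row.  fX(x): add down each column.
f_y = {y: sum(row) for y, row in table.items()}
f_x = {x: sum(row[k] for row in table.values()) for k, x in enumerate(xs)}

# Means computed from the marginals.
e_x = sum(x * p for x, p in f_x.items())
e_y = sum(y * p for y, p in f_y.items())

print({x: round(p, 2) for x, p in f_x.items()})
print({y: round(p, 2) for y, p in f_y.items()})
print(round(e_x, 2), round(e_y, 2))  # 2.35 2.49
```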

4) Continuous Example (Server Access Time, 0 < x < y)

Let X be connect time (ms) and Y authorization time (ms) with joint PDF on 0 < x < y:

fXY(x, y) = 6×10^{−6} · e^{−0.001x − 0.002y},   0 < x < y < ∞.

For a, b ≥ 0 and m = min(a, b),

P(X ≤ a, Y ≤ b) = (1 − e^{−0.003m}) − 3 e^{−0.002b} (1 − e^{−0.001m}).
[Figure: the shaded triangle is the support (0 < x < y); the darker polygon is the event region (X ≤ a, Y ≤ b) inside the support.]

Geometry shows where to integrate; the probability is the integral of the pdf over the darker region.

Note: the complement probability is 1 − P(X ≤ a, Y ≤ b).

Check: with a = 1000, b = 2000, the probability ≈ 0.915.

4.1) Worked Derivation of P(X ≤ a, Y ≤ b) (textbook style)

Region split (b ≥ a case)

P = ∫_{y=0}^{a} ∫_{x=0}^{y} f(x,y) dx dy  +  ∫_{y=a}^{b} ∫_{x=0}^{a} f(x,y) dx dy,   where   f(x,y) = 6·10^{−6} e^{−0.001x − 0.002y},   0 < x < y. Call the first integral I1 and the second I2.

Compute I1

I1 = ∫_{0}^{a} ∫_{0}^{y} 6·10^{−6} e^{−0.001x − 0.002y} dx dy = ∫_{0}^{a} 6·10^{−6} e^{−0.002y} [ ∫_{0}^{y} e^{−0.001x} dx ] dy = ∫_{0}^{a} 0.006 ( e^{−0.002y} − e^{−0.003y} ) dy.
⇒ I1 = 0.006 [ −(1/0.002) e^{−0.002y} + (1/0.003) e^{−0.003y} ]_{y=0}^{y=a} = 1 − 3 e^{−0.002a} + 2 e^{−0.003a}.

Compute I2

I2 = ∫_{a}^{b} ∫_{0}^{a} 6·10^{−6} e^{−0.001x − 0.002y} dx dy = ∫_{a}^{b} 6·10^{−6} e^{−0.002y} [ ∫_{0}^{a} e^{−0.001x} dx ] dy = 0.006 (1 − e^{−0.001a}) ∫_{a}^{b} e^{−0.002y} dy.
⇒ I2 = 0.006 (1 − e^{−0.001a}) [ −(1/0.002) e^{−0.002y} ]_{y=a}^{y=b} = −3 e^{−0.002b} (1 − e^{−0.001a}) + 3 e^{−0.002a} (1 − e^{−0.001a}).

Add and simplify

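P = I1 + I2 = 1 − e^{−0.003a} − 3 e^{−0.002b} (1 − e^{−0.001a}).
(The −e^{−0.003a} term comes from 2e^{−0.003a} − 3e^{−0.003a}; the −3e^{−0.002a} and +3e^{−0.002a} terms cancel.)
When b < a, the constraint X ≤ a is automatic (X < Y ≤ b), so P = P(Y ≤ b) = 1 − 3 e^{−0.002b} + 2 e^{−0.003b}. Both cases are covered by writing m = min(a, b):
P(X ≤ a, Y ≤ b) = 1 − e^{−0.003m} − 3 e^{−0.002b} (1 − e^{−0.001m}).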

Plug in a = 1000, b = 2000

m = 1000,   e^{−0.003m} = e^{−3},   e^{−0.002b} = e^{−4},   e^{−0.001m} = e^{−1}.
P = 1 − e^{−3} − 3 e^{−4} (1 − e^{−1}) = 0.915480.   Complement = 0.084520.
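
The closed form is easy to evaluate in code. The sketch below wraps it in a function (the name `joint_cdf_server` is made up for this example) and reproduces the plug-in values for a = 1000, b = 2000. Returning 0 for non-positive arguments is an added convention, since the support requires 0 < x < y.

```python
import math

def joint_cdf_server(a, b):
    """P(X <= a, Y <= b) for f(x, y) = 6e-6 * exp(-0.001x - 0.002y) on 0 < x < y."""
    if a <= 0 or b <= 0:
        return 0.0  # no probability mass outside the first quadrant
    m = min(a, b)
    return (1 - math.exp(-0.003 * m)) - 3 * math.exp(-0.002 * b) * (1 - math.exp(-0.001 * m))

p = joint_cdf_server(1000, 2000)
print("%.6f %.6f" % (p, 1 - p))  # 0.915480 0.084520
```

Swapping the arguments so that b < a exercises the other branch of m = min(a, b); in that case the result depends on b only.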

5) Continuous (Very Clear): Solve k, Marginals, CDF + 3D View (support 0 < x < y)

This triangular-support example is a separate problem from the server-time example above.

5.1 Define the PDF and find k

Let f(x,y) = k · e^{−(x+y)} on 0 < x < y < ∞ (0 otherwise). Find k by enforcing ∬ f = 1:

1 = ∫_{0}^{∞} ∫_{0}^{y} k e^{−(x+y)} dx dy = k ∫_{0}^{∞} e^{−y} (1 − e^{−y}) dy = k (1 − 1/2) = k · 1/2.
k = 2, so f(x,y) = 2 e^{−(x+y)} on 0 < x < y.

5.2 Marginal PDFs

fX(x) = ∫_{y=x}^{∞} 2e^{−(x+y)} dy = 2 e^{−2x},   x > 0.
fY(y) = ∫_{x=0}^{y} 2e^{−(x+y)} dx = 2 e^{−y} − 2 e^{−2y},   y > 0.

5.3 Joint CDF F(x,y) = P(X ≤ x, Y ≤ y)

Because the support is 0 < x < y, the rectangle [0,x]×[0,y] intersects the triangle differently depending on y ≤ x or y > x.

Case y ≤ x: F(x,y) = (1 − e^{−y})².
Case y > x: F(x,y) = 1 − e^{−2x} − 2 e^{−y} + 2 e^{−(x+y)}.
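
The two cases translate directly into a piecewise function. The following is a minimal sketch (the name `joint_cdf` is illustrative, and returning 0 for non-positive arguments is an added convention) with a few checks one might run: the branches agree on the line y = x, and letting y grow large recovers the marginal CDF of X, 1 − e^{−2x}.

```python
import math

def joint_cdf(x, y):
    """F(x, y) = P(X <= x, Y <= y) for f(x, y) = 2 e^{-(x+y)} on 0 < x < y."""
    if x <= 0 or y <= 0:
        return 0.0
    if y <= x:
        # The rectangle already contains every support point with Y <= y.
        return (1 - math.exp(-y)) ** 2
    return 1 - math.exp(-2 * x) - 2 * math.exp(-y) + 2 * math.exp(-(x + y))

# The two branches meet on the line y = x.
print(joint_cdf(1.0, 1.0), joint_cdf(1.0, 1.0 + 1e-12))

# As y grows, F(x, y) approaches the marginal CDF of X.
print(joint_cdf(0.7, 50.0), 1 - math.exp(-1.4))
```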
[Figure, left panel: geometry of the integration region. The shaded triangle is the support (0 < x < y); the darker polygon is the event (X ≤ a, Y ≤ b) inside the support.]

[Figure, right panel: 3D wireframe of the pdf surface z = 2·e^{−(x+y)}, with the domain clipped to 0 ≤ x, y ≤ 5.]

Probability = the volume under this surface above the darker region in the left panel.

6) Limits & Order of Integration — FAQ (very clear)

6.1 Why are the inner limits 0 to y?

Support: S = { (x,y): 0 < x < y < ∞ }. Fix y. The support allows x only between 0 and y. Therefore,

∬_S f(x,y) dx dy = ∫_{y=0}^{∞} ∫_{x=0}^{y} f(x,y) dx dy.

6.2 Can I integrate dy first instead?

Yes. Fix x. Then the support allows y from x to ∞. So the equivalent order is

∬_S f(x,y) dy dx = ∫_{x=0}^{∞} ∫_{y=x}^{∞} f(x,y) dy dx.

By Fubini/Tonelli (nonnegative integrable pdf), both orders give the same result.

6.3 Where do the −∞ bounds appear?

In the joint CDF definition:

FXY(x,y) = P(X ≤ x, Y ≤ y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f(u,v) dv du.

For normalization of a pdf, integrate over the support, not −∞..∞ blindly.

6.4 Normalization solved both ways (they match)

(1) ∫_{y=0}^{∞} ∫_{x=0}^{y} k e^{−(x+y)} dx dy = k ∫_{0}^{∞} e^{−y} (1 − e^{−y}) dy = k (1 − 1/2) = k/2.
(2) ∫_{x=0}^{∞} ∫_{y=x}^{∞} k e^{−(x+y)} dy dx = k ∫_{0}^{∞} e^{−x} ( ∫_{y=x}^{∞} e^{−y} dy ) dx = k ∫_{0}^{∞} e^{−x} e^{−x} dx = k/2.

Set either equal to 1 ⇒ k = 2.
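
A numerical check of the two orders can help build confidence in the limits. The sketch below integrates e^{−(x+y)} (that is, k = 1) over the support with a simple midpoint rule, once with x as the inner variable and once with y as the inner variable. The truncation point `upper` and grid size `n` are arbitrary choices for this illustration, so both results are approximations to 1/2.

```python
import math

def integral_dx_first(upper=25.0, n=500):
    """Outer y in (0, upper), inner x in (0, y): limits read off the support."""
    hy = upper / n
    total = 0.0
    for j in range(n):
        y = (j + 0.5) * hy
        hx = y / n  # inner slice has length y
        inner = sum(math.exp(-((i + 0.5) * hx + y)) for i in range(n))
        total += inner * hx * hy
    return total

def integral_dy_first(upper=25.0, n=500):
    """Outer x in (0, upper), inner y in (x, upper): the same region, other order."""
    hx = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        hy = (upper - x) / n  # inner slice runs from x to the truncation point
        inner = sum(math.exp(-(x + x + (j + 0.5) * hy)) for j in range(n))
        total += inner * hy * hx
    return total

a = integral_dx_first()
b = integral_dy_first()
print(round(a, 3), round(b, 3))        # both should be close to 0.5
print(round(1 / a, 2), round(1 / b, 2))  # implied k, close to 2
```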

6.5 Geometry recap

S is the unbounded triangular region above the line y = x in the first quadrant. For each fixed y > 0, x runs 0→y (a horizontal slice). For each fixed x > 0, y runs x→∞ (a vertical slice). Limits always come from S.

7) Student FAQ

Short questions with short answers.

Q. What is a joint PDF vs. a joint PMF?
PMF fXY(x,y) is for discrete X,Y (probability at grid points).
PDF fXY(x,y) is for continuous X,Y (a surface). The probability of a region R is the volume under the surface over R: ∬_R fXY(x,y) dx dy.
Q. What is the support and why do I care?
The support is where f(x,y)>0 (allowed points). Draw it first.
All integration limits come from the support. If the pdf is 0 outside, you do not integrate there.
Q. How do I check if a joint pdf is valid?
(1) f(x,y) ≥ 0 on the support. (2) ∬_{support} f(x,y) dx dy = 1.
Example: f(x,y) = k e^{−(x+y)} on 0 < x < y ⇒ k = 2.
Q. When do I integrate dx first vs dy first?
Either order works (Fubini). Choose the one that makes limits simpler.
For 0<x<y: horizontal slices (fix y) ⇒ x: 0→y; vertical slices (fix x) ⇒ y: x→∞. Both give the same result.
Q. Why do I sometimes see −∞ in integrals?
Only in the CDF definition: FXY(x,y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f(u,v) dv du.
For normalization or marginals, integrate over the support, not −∞..∞ blindly.
Q. How do I get marginals from a joint pdf?
fX(x) = ∫ f(x,y) dy with y-limits from the support at that x.
fY(y) = ∫ f(x,y) dx with x-limits from the support at that y.
Example (0<x<y): fX(x) = ∫_{y=x}^{∞} 2e^{−(x+y)} dy = 2e^{−2x}.
Q. What does “independent” mean here?
Independent ⇔ fXY(x,y)=fX(x)fY(y) for all x,y (or CDFs multiply).
If X and Y are independent, knowing one tells you nothing about the other.
Q. Does zero correlation mean independence?
No. Zero covariance/correlation does not guarantee independence (except in special families like jointly normal). Do not rely on it.
Q. How do I compute P(a<X<b, c<Y<d) with a pdf?
Integrate over the intersection of the rectangle and the support. If the support is triangular (0<x<y), the rectangle might split into pieces — set up piecewise integrals or use the joint CDF if it fits.
Q. How do I get a conditional pdf?
fX|Y(x|y) = f(x,y) / fY(y), defined where fY(y) > 0, on the slice of the support where Y = y.
For 0<x<y with f = 2e^{−(x+y)}: fX|Y(x|y) = (2e^{−(x+y)}) / (2e^{−y} − 2e^{−2y}) = e^{−x} / (1 − e^{−y}) for 0 < x < y.
Q. Discrete table: how do I compute a conditional probability?
P(X=x | Y=y) = fXY(x,y) / fY(y) = fXY(x,y) / ∑_{x′} fXY(x′,y) = cell / row-sum (when rows are indexed by y).

Independence test: check if each cell ≈ column-marginal × row-marginal.
Q. Common mistakes checklist
• Integrating over −∞..∞ instead of the support.
• Forgetting that a PDF value is not a probability at a point (only integrals over regions give probabilities).
• Mixing up inner/outer limits; draw the region first.
• Assuming independence without checking factorization.
• Dropping units: keep track of what x and y represent.