📘 Sessions 3.1-3.2: Probability Distributions & Mass Functions

🔢 Probability Mass Function (PMF)

A Probability Mass Function (PMF) describes the probabilities of a discrete random variable. It tells us how likely each possible outcome is.

If X takes values x1, x2, ..., xn, then its PMF f satisfies:

📌 f(x) ≥ 0 for all x → probabilities cannot be negative.
📌 Σ f(xi) = 1 → the probabilities of all possible outcomes must add up to 1 (certainty).
📌 f(x) = P(X = x) → the function gives the probability that the random variable takes the value x.

🔍 Why is this important?

The PMF is like a “probability table” — it tells you exactly how the chance is spread over outcomes.
It simplifies real experiments: instead of tracking every detailed scenario, we only track the random variable’s values.
It is the foundation for calculating other quantities such as the mean, variance, or the CDF.

🛠 Example

If X = number of heads in two coin flips, then possible values are {0, 1, 2}:

P(X=0) = 0.25 (both tails)
P(X=1) = 0.50 (one head, one tail)
P(X=2) = 0.25 (two heads)

Here the PMF is f(0)=0.25, f(1)=0.50, f(2)=0.25. Notice how they are all ≥ 0 and add up to 1.
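
The coin-flip PMF can be rebuilt by listing the four equally likely outcomes and counting heads. The following is a minimal Python sketch, assuming a fair coin; the names `outcomes`, `counts`, and `pmf` are illustrative and not part of the original notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two fair coin flips: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Count how many outcomes give each number of heads.
counts = Counter(flips.count("H") for flips in outcomes)

# PMF: f(x) = (number of outcomes with x heads) / (total number of outcomes).
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, prob in pmf.items():
    print(f"f({x}) = {float(prob):.2f}")

# The two defining checks: every value is nonnegative and they sum to 1.
assert all(prob >= 0 for prob in pmf.values())
assert sum(pmf.values()) == 1
```

Exact fractions are used here so that the sum-to-one check does not depend on floating-point rounding.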

📊 Example 3.3 – Bits in Error

Let X = number of bits in error in the next 4 transmitted bits. Given:

P(X=0) = 0.6561
P(X=1) = 0.2916
P(X=2) = 0.0486
P(X=3) = 0.0036
P(X=4) = 0.0001
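
As a quick sanity check, the five given values can be tested against the PMF conditions, and the mean can be computed from them. This is a sketch only; the helper `is_valid_pmf` and its tolerance are assumptions made for illustration.

```python
import math

# PMF values as given in Example 3.3.
pmf = {0: 0.6561, 1: 0.2916, 2: 0.0486, 3: 0.0036, 4: 0.0001}

def is_valid_pmf(pmf, tol=1e-9):
    """Return True if all probabilities are >= 0 and they sum to 1."""
    nonnegative = all(p >= 0 for p in pmf.values())
    sums_to_one = math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)
    return nonnegative and sums_to_one

# Mean of X: E[X] = sum of x * f(x) over the support.
mean = sum(x * p for x, p in pmf.items())

print(is_valid_pmf(pmf))
print(round(mean, 4))
```

A tolerance is used for the sum because decimal probabilities are not stored exactly as floating-point numbers.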

📐 Example 3.4 – Wafer Contamination (Geometric)

Background: In semiconductor manufacturing, wafers are inspected for large contamination particles. Let the random variable X = the number of wafers analyzed until the first contaminated wafer is found. Suppose each wafer is contaminated independently with probability p = 0.01.

Sample space: each outcome is a string of clean wafers (labeled a) followed by one contaminated wafer (labeled p). Here the label p marks a contaminated wafer; it is not the probability p = 0.01.

{ p, ap, aap, aaap, aaaap, … }

Special cases:

P(X=1) = P(p) = 0.01
P(X=2) = P(ap) = (0.99)(0.01) = 0.0099
P(X=3) = P(aap) = (0.99)²(0.01) = 0.009801

General formula (Geometric PMF):

\[ P(X = x) = (1 - p)^{x-1}\, p, \quad x = 1, 2, 3, \ldots \]

🔎 Finite illustration (n = 5)

To make the distribution concrete, here are the first five probabilities for p = 0.01:

P(X=1) = 0.01
P(X=2) = 0.0099
P(X=3) = 0.009801
P(X=4) = 0.00970299
P(X=5) = 0.0096059601

Note: The geometric distribution has infinitely many outcomes, so the cutoff at x=5 only shows the pattern. These five terms sum to about 0.049; the total probability over all x is 1.
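
The first five geometric probabilities can be computed directly from the general formula. The sketch below assumes p = 0.01 as in Example 3.4; the function name `geometric_pmf` is illustrative.

```python
p = 0.01  # contamination probability from Example 3.4

def geometric_pmf(x, p):
    """P(X = x) = (1 - p)**(x - 1) * p for x = 1, 2, 3, ...; 0 otherwise."""
    if x < 1:
        return 0.0
    return (1 - p) ** (x - 1) * p

first_five = [geometric_pmf(x, p) for x in range(1, 6)]
for x, prob in enumerate(first_five, start=1):
    print(f"P(X={x}) = {prob:.8f}")

# The first five terms cover only a small share of the total probability.
partial_sum = sum(first_five)
print(f"P(X <= 5) = {partial_sum:.6f}")
```

Each probability is the previous one multiplied by 0.99, which is why the values shrink so slowly when p is small.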

📈 Cumulative Distribution Function (CDF) — Detailed

Definition

For a random variable X, the cumulative distribution function (CDF) is F(x) = P(X ≤ x). It accumulates probability up to the threshold x.

Core Properties (always true)

📌 Bounds: 0 ≤ F(x) ≤ 1 for all x.
📌 Monotonicity: If a < b, then F(a) ≤ F(b) (probability never goes down as you allow larger outcomes).
📌 Limits: lim_x→−∞ F(x) = 0 and lim_x→+∞ F(x) = 1.
📌 Right-continuity: F(x) = lim_h↓0 F(x+h). For a discrete X, F has jumps at the support points.

Discrete vs Continuous (what the graph looks like)

🎯 Discrete X: step function. Each step (jump) size equals P(X = x).
Formula: P(X = x) = F(x) − F(x⁻), where F(x⁻) is the left limit.
🌊 Continuous X: smooth accumulation. If a density f exists: F(x) = ∫_−∞^x f(t) dt and f(x) = F′(x) almost everywhere.

How to use a CDF

Intervals: P(a < X ≤ b) = F(b) − F(a).
Point probability (discrete): P(X = x) = F(x) − F(x⁻).
Quantiles: the p-quantile is any value x_p with F(x_p) ≥ p and F(x_p−) ≤ p. The median is a 0.5-quantile.
Expectations (discrete): if you know the PMF f(x), then E[X] = Σ x·f(x). Using the CDF directly: for nonnegative integer X, E[X] = Σ_k≥1 P(X ≥ k) = Σ_k≥1 (1 − F(k−)), where F(k−) = F(k − 1) because X takes only integer values.
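
The interval rule and the two expectation formulas can be compared numerically. The sketch below borrows the small PMF from the PMF practice question in these notes (P(X=0)=0.5, P(X=1)=0.4, P(X=2)=0.1); the function `cdf` and the variable names are illustrative.

```python
import math

# PMF borrowed from the PMF practice question in these notes.
pmf = {0: 0.5, 1: 0.4, 2: 0.1}

def cdf(x):
    """F(x) = P(X <= x): add up the PMF over support points <= x."""
    return sum(p for value, p in pmf.items() if value <= x)

# Interval rule: P(a < X <= b) = F(b) - F(a), here with a = 0 and b = 2.
p_interval = cdf(2) - cdf(0)

# Expectation from the PMF: E[X] = sum of x * f(x).
mean_from_pmf = sum(x * p for x, p in pmf.items())

# Expectation from the CDF: E[X] = sum over k >= 1 of P(X >= k),
# and P(X >= k) = 1 - F(k - 1) because X only takes integer values.
mean_from_cdf = sum(1 - cdf(k - 1) for k in range(1, max(pmf) + 1))

print(round(p_interval, 4), round(mean_from_pmf, 4), round(mean_from_cdf, 4))
```

The tail sum stops at the largest support point because P(X ≥ k) is zero beyond it.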

Quick checks you can do with any CDF

Is it nondecreasing and between 0 and 1? (If not, something’s wrong.)
Do the jumps at the support points add to 1 for discrete X? (They should.)
Does F(x) approach 1 for large x? (It must.)
Can you recover P(X=x) from the jump sizes? (Yes, that’s the PMF.)
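
The quick checks above can be sketched as a small validation routine. It assumes the CDF is supplied as a sorted list of (x, F(x)) pairs at the support points, which is a representation chosen here for illustration; the sample values come from the CDF practice question in these notes.

```python
import math

def check_discrete_cdf(points, tol=1e-9):
    """points: list of (x, F(x)) at the support points, sorted by x.

    Returns the PMF recovered from the jump sizes, or raises ValueError
    if one of the quick checks fails.
    """
    values = [F for _, F in points]
    if any(F < -tol or F > 1 + tol for F in values):
        raise ValueError("F must stay between 0 and 1")
    if any(b < a - tol for a, b in zip(values, values[1:])):
        raise ValueError("F must be nondecreasing")
    if not math.isclose(values[-1], 1.0, abs_tol=tol):
        raise ValueError("F must reach 1 at the largest support point")
    # Jump at each support point = F(x) - F(x-); F is 0 before the first one.
    previous = [0.0] + values[:-1]
    return {x: F - prev for (x, F), prev in zip(points, previous)}

pmf = check_discrete_cdf([(0, 0.5), (1, 0.8), (2, 1.0)])
for x, prob in pmf.items():
    print(f"P(X={x}) = {prob:.2f}")
```

Because the jumps telescope, they add up to the final CDF value, so checking that F reaches 1 also confirms that the recovered PMF sums to 1.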

📈 CDF — Bits in Error (Discrete Step Plot)

CDF for the “bits in error” example (0–4 bits). Each jump size equals the PMF at that x: f(0)=0.6561, f(1)=0.2916, f(2)=0.0486, f(3)=0.0036, f(4)=0.0001.
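
The step heights of this CDF are the running totals of the PMF. A minimal sketch, using the five probabilities given for the example; the variable names are illustrative.

```python
from itertools import accumulate

support = [0, 1, 2, 3, 4]
pmf = [0.6561, 0.2916, 0.0486, 0.0036, 0.0001]

# Running totals of the PMF give F(x) at each support point.
cdf = list(accumulate(pmf))

for x, F in zip(support, cdf):
    print(f"F({x}) = {F:.4f}")
```

Between support points the CDF stays flat at the most recent value, which produces the step shape.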

🧪 Practice (read off the CDF)

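These questions refer to the geometric wafer example (Example 3.4, p = 0.01), whose CDF is F(x) = 1 − (1 − p)^x for x = 1, 2, 3, …

From the CDF, estimate P(X ≤ 5). Explain why it’s small for p=0.01.
Using the steps, compute P(3 < X ≤ 8) as F(8) − F(3).
Where would the median roughly fall for p=0.01? (Hint: solve F(x) ≈ 0.5.)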
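
One way to check answers numerically is to evaluate the geometric CDF in closed form. This sketch assumes the questions refer to the geometric wafer example with p = 0.01, and uses F(x) = 1 − (1 − p)^x at integer x; the function name `geometric_cdf` is illustrative.

```python
import math

p = 0.01  # per-wafer contamination probability (Example 3.4)

def geometric_cdf(x, p):
    """F(x) = P(X <= x) = 1 - (1 - p)**floor(x) for x >= 1, else 0."""
    k = math.floor(x)
    return 1 - (1 - p) ** k if k >= 1 else 0.0

prob_le_5 = geometric_cdf(5, p)
prob_3_to_8 = geometric_cdf(8, p) - geometric_cdf(3, p)

# Median: smallest integer x with F(x) >= 0.5,
# found by solving (1 - p)**x <= 0.5 for x.
median = math.ceil(math.log(0.5) / math.log(1 - p))

print(f"P(X <= 5)     = {prob_le_5:.6f}")
print(f"P(3 < X <= 8) = {prob_3_to_8:.6f}")
print(f"median        = {median}")
```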

Goal: Given a piecewise CDF \(F(x)\), recover the PMF \(f(x)=P(X=x)\) by reading the jump sizes.

Given CDF

\[ F(x)= \begin{cases} 0, & x< -2\\[4pt] 0.2, & -2\le x<0\\[4pt] 0.7, & 0\le x<2\\[4pt] 1.0, & x\ge 2 \end{cases} \]

Key rule for discrete variables: \[ f(a)=P(X=a)=F(a)-F\!\left(a^{-}\right) \] The “left limit” \(F(a^-)\) is the value of \(F\) just before \(a\), so \(f(a)\) equals the size of the jump at \(x=a\).

Visualize the CDF (step plot)

Right-continuous CDF with jumps at \(x=-2,0,2\).

Compute the PMF by jump sizes

\(x=a\)	\(F(a^-)\)	\(F(a)\)	\(f(a)=F(a)-F(a^-)\)
\(-2\)	0	0.2	0.2
\(0\)	0.2	0.7	0.5
\(2\)	0.7	1.0	0.3
Sum \(\sum f(a)\)			1.0
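
The jump-size rule can also be applied by evaluating the given piecewise \(F(x)\) as a function. In the sketch below, the left limit \(F(a^-)\) is approximated by evaluating slightly to the left of \(a\); the offset `eps` and the names `breakpoints`, `levels`, and `left_limit` are assumptions made for illustration.

```python
from bisect import bisect_right

# Breakpoints and the CDF value on each piece, copied from the given F(x).
breakpoints = [-2, 0, 2]
levels = [0.0, 0.2, 0.7, 1.0]   # F on x < -2, [-2, 0), [0, 2), x >= 2

def F(x):
    """Right-continuous step CDF: pick the piece that contains x."""
    return levels[bisect_right(breakpoints, x)]

def left_limit(a, eps=1e-9):
    """Approximate F(a-) by evaluating just to the left of a."""
    return F(a - eps)

# Jump size at each breakpoint gives the PMF.
pmf = {a: F(a) - left_limit(a) for a in breakpoints}

for a, prob in pmf.items():
    print(f"f({a}) = {prob:.1f}")
print(f"sum = {sum(pmf.values()):.1f}")
```

Using `bisect_right` makes the function right-continuous: at a breakpoint, the value of the piece that starts there is returned.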

🧠 Practice Question (PMF)

A random variable X has the following distribution: P(X=0)=0.5, P(X=1)=0.4, P(X=2)=0.1. What is the expected value E[X]?

Solution:

E[X] = Σ x·P(x) = (0)(0.5) + (1)(0.4) + (2)(0.1) = 0 + 0.4 + 0.2 = 0.6

🧠 Practice Question (CDF)

A CDF is defined as:
F(x) = 0 for x < 0
F(x) = 0.5 for 0 ≤ x < 1
F(x) = 0.8 for 1 ≤ x < 2
F(x) = 1 for x ≥ 2
What is P(X = 1)?

Solution:
P(X = 1) = F(1) - F(1^-) = 0.8 - 0.5 = 0.3