📘 Sessions 3.1-3.2: Probability Distributions & Mass Functions

🔢 Probability Mass Function (PMF)

A Probability Mass Function (PMF) describes the probabilities of a discrete random variable. It tells us how likely each possible outcome is.

If X takes values x1, x2, ..., xn, then the PMF f(xi) = P(X = xi) satisfies:

\[ f(x_i) \ge 0 \ \text{for every } i, \qquad \sum_{i=1}^{n} f(x_i) = 1 \]

🔍 Why is this important?

🛠 Example

If X = number of heads in two coin flips, then possible values are {0, 1, 2}:

Here the PMF is f(0)=0.25, f(1)=0.50, f(2)=0.25. Notice that every value is ≥ 0 and that they add up to 1.
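
As a quick check, the PMF above can be rebuilt by enumerating the outcomes. This is a minimal sketch that assumes a fair coin and independent flips; the names `outcomes`, `counts` and `pmf` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the four equally likely outcomes of two fair coin flips.
outcomes = list(product("HT", repeat=2))

# Count how many outcomes give each number of heads.
counts = Counter(flips.count("H") for flips in outcomes)

# PMF: f(x) = (number of outcomes with x heads) / (total number of outcomes).
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, prob in pmf.items():
    print(f"f({x}) = {float(prob):.2f}")

# The two defining checks: non-negativity and total probability 1.
assert all(prob >= 0 for prob in pmf.values())
assert sum(pmf.values()) == 1
```

Exact fractions are used so that the sum-to-one check is not affected by floating-point rounding.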

📊 Example 3.3 – Bits in Error

Let X = number of bits in error in the next 4 transmitted bits. Given: P(X=0)=0.6561, P(X=1)=0.2916, P(X=2)=0.0486, P(X=3)=0.0036, P(X=4)=0.0001.

Example 3.4 – Wafer Contamination (Geometric)

Background: In semiconductor manufacturing, wafers are inspected for large contamination particles. Let the random variable X = the number of wafers analyzed until the first contaminated wafer is found. Suppose each wafer is contaminated independently with probability p = 0.01.

Sample space: each outcome looks like a string of clean wafers (a) followed by one contaminated wafer (p). Here a and p are outcome labels; the label p is not the probability p = 0.01.

{ p, ap, aap, aaap, aaaap, … }

Special cases: P(X = 1) = P(p) = 0.01 and P(X = 2) = P(ap) = (0.99)(0.01) = 0.0099.

General formula (Geometric PMF):

\[ P(X = x) = (1 - p)^{x-1} \, p , \quad x = 1, 2, 3, \ldots \]

🔎 Finite illustration (n = 5)

To make the distribution concrete, we display just the first five probabilities (the true support is infinite).

Note: The geometric distribution has infinitely many outcomes. Here we cut off at x=5 just to show the pattern; the total probability over all x is 1.
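
The first five probabilities can be computed directly from the geometric PMF. This is a minimal sketch using p = 0.01 from Example 3.4; the helper name `geometric_pmf` is an assumption made for illustration.

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(X = x) = (1 - p)**(x - 1) * p for x = 1, 2, 3, ...; 0 otherwise."""
    if x < 1:
        return 0.0
    return (1 - p) ** (x - 1) * p


p = 0.01  # contamination probability from Example 3.4

# Truncate the infinite support at x = 5, purely for display.
first_five = {x: geometric_pmf(x, p) for x in range(1, 6)}

for x, prob in first_five.items():
    print(f"P(X = {x}) = {prob:.6f}")

partial = sum(first_five.values())
print(f"Sum of first five terms = {partial:.6f}")
print(f"Remaining tail P(X > 5) = {1 - partial:.6f}")
```

The five terms add up to only about 0.049, so almost all of the probability sits in the tail beyond x = 5.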

📈 Cumulative Distribution Function (CDF): Detailed

Definition

For a random variable X, the cumulative distribution function (CDF) is F(x) = P(X ≤ x). It accumulates probability up to the threshold x.
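
As a worked illustration of this definition (a sketch, assuming the geometric model of Example 3.4 with contamination probability \(p\)), the CDF at an integer \(x \ge 1\) follows from a finite geometric series:

\[ F(x) = \sum_{k=1}^{x} (1-p)^{k-1} \, p = p \cdot \frac{1-(1-p)^{x}}{1-(1-p)} = 1-(1-p)^{x} \]

Equivalently, \(P(X > x) = (1-p)^{x}\), the probability that the first \(x\) wafers are all clean. With \(p = 0.01\) this gives \(F(x) = 1 - (0.99)^{x}\).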

Core Properties (always true)

Discrete vs Continuous (what the graph looks like)

How to use a CDF

Quick checks you can do with any CDF

📈 CDF: Bits in Error (Discrete Step Plot)

CDF for the "bits in error" example (0–4 bits). Each jump size equals the PMF at that x: P(0)=0.6561, P(1)=0.2916, P(2)=0.0486, P(3)=0.0036, P(4)=0.0001.
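
The step heights of this CDF are running totals of the PMF values quoted above. This is a minimal sketch; the variable names are illustrative, and rounding to four decimals is only there to suppress floating-point noise.

```python
from itertools import accumulate

# PMF values quoted above for X = number of bits in error (x = 0..4).
pmf = [0.6561, 0.2916, 0.0486, 0.0036, 0.0001]

# F(x) = P(X <= x) is the running total of the PMF.
cdf = [round(total, 4) for total in accumulate(pmf)]

for x, (jump, level) in enumerate(zip(pmf, cdf)):
    print(f"x = {x}: jump = {jump:.4f}, F({x}) = {level:.4f}")
```

The last step reaches 1.0, as any CDF must once every possible value has been included.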

🧪 Practice (read off the CDF)

These questions refer to the geometric wafer example with p = 0.01, where F(x) = 1 − (0.99)^x for x = 1, 2, 3, …

  1. From the plot, estimate P(X ≤ 5). Explain why it's small for p=0.01.
  2. Using the steps, compute P(3 < X ≤ 8) as F(8) − F(3).
  3. Where would the median roughly fall for p=0.01? (Hint: solve F(x) ≈ 0.5.)

Solution:

  1. Estimate P(X ≤ 5)
    F(5) = 1 − (0.99)^5 ≈ 0.0490 (≈ 4.90%).
    👉 Small because p = 0.01 is tiny: the expected wait is E[X] = 1/p = 100 wafers, so finding a contaminated wafer within the first 5 is unlikely.
  2. Compute P(3 < X ≤ 8)
    F(8) = 1 − (0.99)^8 ≈ 0.0773
    F(3) = 1 − (0.99)^3 ≈ 0.0297
    P(3 < X ≤ 8) = F(8) − F(3) ≈ 0.0476 (≈ 4.76%).
  3. Median for p = 0.01
    Solve F(x) ≥ 0.5, i.e. (0.99)^x ≤ 0.5 → x ≥ ln(0.5)/ln(0.99) ≈ 68.97.
    👉 Median = smallest integer ≥ this value: 69 wafers.
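
The three answers can be checked numerically. This is a minimal sketch that assumes the closed form F(x) = 1 − (1 − p)^x used in the solutions; the helper name `geometric_cdf` is an assumption made for illustration.

```python
import math


def geometric_cdf(x: int, p: float) -> float:
    """F(x) = P(X <= x) = 1 - (1 - p)**x for integer x >= 1, and 0 below 1."""
    return 1 - (1 - p) ** x if x >= 1 else 0.0


p = 0.01

f5 = geometric_cdf(5, p)                                  # question 1
interval = geometric_cdf(8, p) - geometric_cdf(3, p)      # question 2
median = math.ceil(math.log(0.5) / math.log(1 - p))       # question 3

print(f"P(X <= 5)     = {f5:.4f}")
print(f"P(3 < X <= 8) = {interval:.4f}")
print(f"Median        = {median}")
```
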

📘 Example 3.6: Determine PMF from CDF (Step-by-Step)

Goal: Given a piecewise CDF \(F(x)\), recover the PMF \(f(x)=P(X=x)\) by reading the jump sizes.

Given CDF

\[ F(x)= \begin{cases} 0, & x< -2\\[4pt] 0.2, & -2\le x<0\\[4pt] 0.7, & 0\le x<2\\[4pt] 1.0, & x\ge 2 \end{cases} \]

Key rule for discrete variables: \[ f(a)=P(X=a)=F(a)-F\!\left(a^{-}\right) \] The "left limit" \(F(a^-)\) is the value of \(F\) just before \(a\), so this difference equals the size of the jump at \(x=a\).

Visualize the CDF (step plot)

Right-continuous CDF with jumps at \(x=-2,0,2\).

Compute the PMF by jump sizes

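\[
\begin{array}{c|c|c|c}
x=a & F(a^-) & F(a) & f(a)=F(a)-F(a^-) \\ \hline
-2 & 0 & 0.2 & 0.2 \\
0 & 0.2 & 0.7 & 0.5 \\
2 & 0.7 & 1.0 & 0.3 \\ \hline
\text{Sum} & & & \sum f(a) = 1.0
\end{array}
\]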
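
The jump-size rule can also be sketched in Python. This is an illustrative sketch only: the lists `breakpoints` and `levels` are copied from the piecewise CDF of Example 3.6, and the helper `cdf` is a hypothetical name.

```python
import bisect

# Breakpoints of the step CDF and the level F takes from each breakpoint onward.
breakpoints = [-2, 0, 2]
levels = [0.2, 0.7, 1.0]


def cdf(x: float) -> float:
    """Right-continuous step CDF: the level of the last breakpoint <= x."""
    i = bisect.bisect_right(breakpoints, x)
    return levels[i - 1] if i > 0 else 0.0


# f(a) = F(a) - F(a^-): each jump is the current level minus the previous one.
pmf = {}
previous = 0.0
for a, level in zip(breakpoints, levels):
    pmf[a] = round(level - previous, 10)  # rounding removes float noise
    previous = level

print(pmf)  # {-2: 0.2, 0: 0.5, 2: 0.3}
```

The recovered probabilities add up to 1, which matches the final level of the CDF.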

🧠 Practice Question (PMF)

A random variable X has the following distribution: P(X=0)=0.5, P(X=1)=0.4, P(X=2)=0.1. What is the expected value E[X]?

Solution:

E[X] = Σ x·P(x) = (0)(0.5) + (1)(0.4) + (2)(0.1) = 0 + 0.4 + 0.2 = 0.6
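
The same weighted sum can be written as a short Python check. This is a minimal sketch; the dictionary `pmf` simply restates the distribution given in the question.

```python
# Distribution from the practice question: value -> probability.
pmf = {0: 0.5, 1: 0.4, 2: 0.1}

# A valid PMF must sum to 1 (up to floating-point tolerance).
assert abs(sum(pmf.values()) - 1.0) < 1e-12

# E[X] is the probability-weighted sum of the values.
expected = sum(x * prob for x, prob in pmf.items())
print(f"E[X] = {expected:.1f}")
```
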

🧠 Practice Question (CDF)

A CDF is defined as:
F(x) = 0 for x < 0
F(x) = 0.5 for 0 ≤ x < 1
F(x) = 0.8 for 1 ≤ x < 2
F(x) = 1 for x ≥ 2
What is P(X = 1)?

Solution:
P(X = 1) = F(1) - F(1-) = 0.8 - 0.5 = 0.3

📈 CDF: Bits in Error (Alternate Step Plot)