📊 Chapter 3 Review – Discrete Random Variables & Distributions
🔑 Core Ideas
Excel with a PMF table:
Assume x-values in A2:A10 and probabilities in B2:B10 (with SUM(B2:B10)=1)
Mean (E[X]) =SUMPRODUCT(A2:A10, B2:B10)
Second moment E[X^2] =SUMPRODUCT((A2:A10)^2, B2:B10)
Variance =SUMPRODUCT((A2:A10)^2,B2:B10) - (SUMPRODUCT(A2:A10,B2:B10))^2
Std dev =SQRT( SUMPRODUCT((A2:A10)^2,B2:B10) - (SUMPRODUCT(A2:A10,B2:B10))^2 )
Check Σp(x)=1 =SUM(B2:B10)
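The same table arithmetic can be cross-checked outside Excel. The sketch below is a minimal Python version of the SUMPRODUCT formulas above. The function name pmf_moments and the four-row table are hypothetical, chosen only for illustration.

```python
import math

def pmf_moments(xs, ps):
    """Return (total probability, mean, variance, std dev) for a PMF table."""
    total = sum(ps)                                  # like =SUM(B2:B10)
    mean = sum(x * p for x, p in zip(xs, ps))        # like =SUMPRODUCT(A, B)
    second = sum(x * x * p for x, p in zip(xs, ps))  # E[X^2]
    var = second - mean ** 2                         # E[X^2] - (E[X])^2
    sd = math.sqrt(max(var, 0.0))  # guard against tiny negative rounding error
    return total, mean, var, sd

# Hypothetical table, for illustration only
xs = [0, 1, 2, 3]
ps = [0.1, 0.2, 0.3, 0.4]
total, mean, var, sd = pmf_moments(xs, ps)
print(total, mean, var, sd)
```

If the printed total is not (approximately) 1, the mean and variance are not meaningful, just as with the Excel check.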
🎲 Discrete Uniform Distribution
Model: X takes each integer from a to b (inclusive) with equal probability.
f(x) = 1 / (b − a + 1), x = a, a+1, …, b
E[X] = (a + b)/2
Var(X) = { (b − a + 1)² − 1 } / 12
Mean =(a+b)/2
Variance =(((b-a+1)^2)-1)/12
PMF for x: list the support in column A (from A2 down) and set every matching cell in column B to =1/(b-a+1), where a and b are named cells or typed-in values.
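As a rough cross-check of the closed-form mean and variance, the sketch below builds the discrete uniform PMF explicitly and compares the brute-force moments with the formulas. The function name discrete_uniform is an assumption made for this example.

```python
def discrete_uniform(a, b):
    """Return (support, pmf, mean, variance) for the integers a..b inclusive."""
    n = b - a + 1                      # number of equally likely values
    support = list(range(a, b + 1))
    pmf = [1 / n] * n                  # f(x) = 1 / (b - a + 1)
    mean = (a + b) / 2
    var = (n * n - 1) / 12
    return support, pmf, mean, var

# Compare the closed forms with moments computed directly from the table
support, pmf, mean, var = discrete_uniform(1, 6)
brute_mean = sum(x * p for x, p in zip(support, pmf))
brute_var = sum(x * x * p for x, p in zip(support, pmf)) - brute_mean ** 2
print(mean, brute_mean)
print(var, brute_var)
```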
🧪 Binomial Distribution — X ~ Binomial(n, p)
Model: Number of successes in n independent Bernoulli trials (success prob. p).
P(X = x) = C(n, x) p^x (1 − p)^{n − x}, x = 0,1,…,n
E[X] = n p, Var(X) = n p (1 − p)
PMF (exactly x) =BINOM.DIST(x, n, p, FALSE)
CDF (≤ x) =BINOM.DIST(x, n, p, TRUE)
Right tail (≥ k) =1 - BINOM.DIST(k-1, n, p, TRUE)
Range a..b =BINOM.DIST(b,n,p,TRUE) - BINOM.DIST(a-1,n,p,TRUE)
Quantile (min x with F≥α)=BINOM.INV(n, p, α)
Normal approx. if n is large and p is not extreme (a common rule of thumb: np ≥ 5 and n(1 − p) ≥ 5); use a continuity correction. Poisson approx. if n is large and p is small, with λ = np.
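The Excel calls above can be mirrored with a few lines of Python, which may help when checking answers by hand. This is a sketch using only the standard library. The helper names (binom_pmf, binom_cdf, binom_right_tail, binom_range, binom_inv) are made up for this example and are not Excel or library functions.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x), like BINOM.DIST(x, n, p, FALSE)."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """P(X <= x), like BINOM.DIST(x, n, p, TRUE); returns 0 for x < 0."""
    return sum(binom_pmf(i, n, p) for i in range(0, min(x, n) + 1))

def binom_right_tail(k, n, p):
    """P(X >= k) = 1 - P(X <= k - 1)."""
    return 1 - binom_cdf(k - 1, n, p)

def binom_range(a, b, n, p):
    """P(a <= X <= b) = F(b) - F(a - 1)."""
    return binom_cdf(b, n, p) - binom_cdf(a - 1, n, p)

def binom_inv(n, p, alpha):
    """Smallest x with P(X <= x) >= alpha, like BINOM.INV(n, p, alpha)."""
    total = 0.0
    for x in range(n + 1):
        total += binom_pmf(x, n, p)
        if total >= alpha:
            return x
    return n  # reached only if rounding keeps the running total below alpha

print(binom_pmf(10, 20, 0.5))
print(binom_right_tail(12, 20, 0.5))
print(binom_inv(20, 0.5, 0.5))
```

Note the k − 1 in the right tail and the a − 1 in the range: both come from X being integer-valued, so P(X ≥ k) excludes everything up to k − 1.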
🎯 Geometric Distribution — “Trials until first success”
Model: X = number of trials needed for the first success. Support: 1,2,3,…
P(X = k) = (1 − p)^{k − 1} p
F(k) = P(X ≤ k) = 1 − (1 − p)^k
E[X] = 1/p, Var(X) = (1 − p) / p²
Excel (via Negative Binomial with r=1):
PMF at k =NEGBINOM.DIST(k-1, 1, p, FALSE)
CDF at k (≤k) =NEGBINOM.DIST(k-1, 1, p, TRUE)
“Lack of memory”: P(X > s+t | X > s) = P(X > t).
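The lack-of-memory property can be checked numerically from the survival function P(X > k) = (1 − p)^k. The sketch below does that for one arbitrary choice of p, s, and t. The function names are hypothetical.

```python
def geom_pmf(k, p):
    """P(X = k) for k = 1, 2, 3, ... (trials until first success)."""
    return (1 - p) ** (k - 1) * p if k >= 1 else 0.0

def geom_cdf(k, p):
    """P(X <= k) = 1 - (1 - p)^k."""
    return 1 - (1 - p) ** k if k >= 1 else 0.0

def geom_survival(k, p):
    """P(X > k) = (1 - p)^k: the first k trials are all failures."""
    return (1 - p) ** k

# Memoryless check with arbitrary illustrative values
p, s, t = 0.25, 3, 4
lhs = geom_survival(s + t, p) / geom_survival(s, p)  # P(X > s+t | X > s)
rhs = geom_survival(t, p)                            # P(X > t)
print(lhs, rhs)
```

The two printed values agree up to rounding because (1 − p)^(s+t) / (1 − p)^s = (1 − p)^t.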
📦 Negative Binomial — X = trials to get r successes
Model: Counts trials until the r-th success (r = 1 recovers geometric).
P(X = k) = C(k−1, r−1) p^r (1 − p)^{k − r}, k = r, r+1, …
E[X] = r/p, Var(X) = r(1 − p)/p²
Excel note: NEGBINOM.DIST uses “number of failures” (f) before r successes.
If X = total trials, then f = X − r.
PMF at X=k =NEGBINOM.DIST(k - r, r, p, FALSE)
CDF at X=k =NEGBINOM.DIST(k - r, r, p, TRUE)
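The trials-versus-failures shift is the usual source of mistakes here, so a small sketch may help. It writes the PMF both ways and confirms they agree when f = k − r. The function names are assumptions for illustration, and the values r = 3, p = 0.5, k = 5 are arbitrary.

```python
from math import comb

def negbin_trials_pmf(k, r, p):
    """P(X = k) where X = total trials needed for the r-th success."""
    if k < r:
        return 0.0
    return comb(k - 1, r - 1) * p ** r * (1 - p) ** (k - r)

def negbin_failures_pmf(f, r, p):
    """P(F = f) where F = failures before the r-th success
    (the parameterization NEGBINOM.DIST uses)."""
    if f < 0:
        return 0.0
    return comb(f + r - 1, r - 1) * p ** r * (1 - p) ** f

def negbin_trials_cdf(k, r, p):
    """P(X <= k), summing the trials-based PMF from r up to k."""
    return sum(negbin_trials_pmf(j, r, p) for j in range(r, k + 1))

r, p, k = 3, 0.5, 5
print(negbin_trials_pmf(k, r, p))        # trials parameterization
print(negbin_failures_pmf(k - r, r, p))  # same event, failures = k - r
print(negbin_trials_cdf(k, r, p))
```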
📈 Poisson Distribution — X ~ Poisson(λ)
Model: Counts of events in a fixed interval when events occur independently at a constant average rate; λ is the expected number of events in that interval.
P(X = x) = e^{−λ} λ^x / x!, x = 0,1,2,…
E[X] = λ, Var(X) = λ
PMF (exactly x) =POISSON.DIST(x, lambda, FALSE)
CDF (≤ x) =POISSON.DIST(x, lambda, TRUE)
Right tail (≥ k) =1 - POISSON.DIST(k-1, lambda, TRUE)
Range a..b =POISSON.DIST(b,lambda,TRUE) - POISSON.DIST(a-1,lambda,TRUE)
Binomial(n,p) ≈ Poisson(λ=np) when n is large and p is small (λ moderate).
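To see how close the Poisson approximation is in practice, the sketch below compares a Binomial(n, p) probability with the matching Poisson(λ = np) probability. The choice n = 1000, p = 0.003 is an arbitrary illustration, and the function names are hypothetical.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """P(X = x), like POISSON.DIST(x, lambda, FALSE)."""
    if x < 0:
        return 0.0
    return exp(-lam) * lam ** x / factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x), like POISSON.DIST(x, lambda, TRUE)."""
    return sum(poisson_pmf(i, lam) for i in range(0, x + 1))

def binomial_pmf(x, n, p):
    """Exact Binomial(n, p) probability, for comparison."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Large n, small p: the two PMFs should be close at every x
n, p = 1000, 0.003
lam = n * p
for x in range(0, 8):
    print(x, binomial_pmf(x, n, p), poisson_pmf(x, lam))
```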
🧠 Quick Checks & Mini Examples
1) PMF sanity check
Given a table of x and f(x), verify f(x)≥0 and Σ f(x)=1. If not, it’s not a valid PMF.
2) Discrete Uniform
Roll a fair die (a=1, b=6): E[X]=(1+6)/2=3.5; Var(X)=((6)^2−1)/12=35/12≈2.917.
3) Binomial
20 T/F questions, random guessing: n=20, p=0.5 ⇒ E[X]=10, Var=5. P(X=10)=BINOM.DIST(10,20,0.5,FALSE)≈0.176.
4) Geometric
Guessing 1 of 4 choices: p=0.25 ⇒ E[X]=4. P(X=3)=(0.75)^2*0.25=0.140625.
5) Negative Binomial
Trials to 5th success, p=0.25: E[X]=20, Var=60. P(X=10)=NEGBINOM.DIST(10-5,5,0.25,FALSE)=C(9,4)(0.25)^5(0.75)^5≈0.0292.
6) Poisson
ER arrivals rate λ=3/hour. P(X=5)=POISSON.DIST(5,3,FALSE)≈0.1008. P(X≤2)=POISSON.DIST(2,3,TRUE)≈0.423.
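The sanity check in item 1 is easy to automate. The sketch below is one possible version. The function name is_valid_pmf and the tolerance are assumptions, and the tolerance exists only because decimal probabilities rarely sum to exactly 1 in floating point.

```python
def is_valid_pmf(ps, tol=1e-9):
    """True if every probability is non-negative and they sum to 1 (within tol)."""
    non_negative = all(p >= 0 for p in ps)
    sums_to_one = abs(sum(ps) - 1.0) <= tol
    return non_negative and sums_to_one

print(is_valid_pmf([0.2, 0.5, 0.3]))   # non-negative and sums to 1
print(is_valid_pmf([0.5, 0.6, -0.1]))  # sums to 1 but has a negative entry
print(is_valid_pmf([0.3, 0.3]))        # sums to 0.6
```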
📎 Excel Cheat Sheet (Chapter 3)
Generic PMF table
E[X] =SUMPRODUCT(A2:A10,B2:B10)
E[X^2] =SUMPRODUCT((A2:A10)^2,B2:B10)
Var(X) =SUMPRODUCT((A2:A10)^2,B2:B10) - (SUMPRODUCT(A2:A10,B2:B10))^2
σ =SQRT( SUMPRODUCT((A2:A10)^2,B2:B10) - (SUMPRODUCT(A2:A10,B2:B10))^2 )
Check =SUM(B2:B10)
Discrete Uniform
Mean =(a+b)/2
Var =(((b-a+1)^2)-1)/12
Distribution functions
Binomial PMF =BINOM.DIST(x, n, p, FALSE)
Binomial CDF =BINOM.DIST(x, n, p, TRUE)
Binomial Quantile =BINOM.INV(n, p, α)
Geometric PMF =NEGBINOM.DIST(k-1, 1, p, FALSE)
Geometric CDF =NEGBINOM.DIST(k-1, 1, p, TRUE)
NegBin PMF (X=k) =NEGBINOM.DIST(k - r, r, p, FALSE)
NegBin CDF =NEGBINOM.DIST(k - r, r, p, TRUE)
Poisson PMF =POISSON.DIST(x, lambda, FALSE)
Poisson CDF =POISSON.DIST(x, lambda, TRUE)
Poisson right tail (≥k) =1 - POISSON.DIST(k-1, lambda, TRUE)
Remember parameterizations: Excel’s NEGBINOM.DIST(number_f, number_s, probability_s, cumulative)
uses the number of failures before the r-th success, with number_s = r. If your variable is “total trials” X, then number_f = X − r.