μ = E(X) = Σ x·f(x)
σ² = V(X) = Σ (x − μ)²·f(x) = Σ x²·f(x) − μ²
V(X) = Σx (x − μ)² f(x) = Σx x² f(x) − 2μ Σx x f(x) + μ² = Σx x² f(x) − μ²
σ = √V(X)
The mean is a weighted average. The variance shows how spread out the values are around the mean.
Let X be the number of error bits in the next four bits transmitted. X ∈ {0,1,2,3,4} with f(0) = 0.6561, f(1) = 0.2916, f(2) = 0.0486, f(3) = 0.0036, f(4) = 0.0001.
Mean (μ = E[X]):
μ = 0(0.6561) + 1(0.2916) + 2(0.0486) + 3(0.0036) + 4(0.0001) = 0.4
Although X never takes the value 0.4, the expected value (weighted average) is 0.4.
Variance (table method):
| x | x − 0.4 | (x − 0.4)² | f(x) | f(x)(x − 0.4)² | 
|---|---|---|---|---|
| 0 | −0.4 | 0.16 | 0.6561 | 0.104976 | 
| 1 | 0.6 | 0.36 | 0.2916 | 0.104976 | 
| 2 | 1.6 | 2.56 | 0.0486 | 0.124416 | 
| 3 | 2.6 | 6.76 | 0.0036 | 0.024336 | 
| 4 | 3.6 | 12.96 | 0.0001 | 0.001296 | 
Variance (add the last column):
Variance is the weighted average of squared deviations:
  V(X) = Σ f(x)·(x − μ)²
From the table, simply add up the last column:
0.104976 + 0.104976 + 0.124416 + 0.024336 + 0.001296 = 0.36
  ✔️ Shortcut check: E[X²] − (E[X])² = 0.52 − (0.4)² = 0.36
✔️ Shortcut check (step by step):
  The shortcut formula is:
  V(X) = E[X²] − (E[X])²
E[X²]:
    
    E[X²] = 0²·0.6561 + 1²·0.2916 + 2²·0.0486 + 3²·0.0036 + 4²·0.0001
          = 0 + 0.2916 + 0.1944 + 0.0324 + 0.0016
          = 0.52
    (E[X])²:
    E[X] = 0.4, so
    (E[X])² = (0.4)² = 0.16
  V(X) = 0.52 − 0.16 = 0.36
  Final Answer: V(X) = 0.36
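The table method and the shortcut can be cross-checked in a few lines of Python. This is a minimal sketch, not part of the original lesson; the function names `mean`, `variance_definition`, and `variance_shortcut` are illustrative choices, and the PMF values are the ones given in the text.

```python
# PMF of X = number of error bits in four transmitted bits (values from the text).
pmf = {0: 0.6561, 1: 0.2916, 2: 0.0486, 3: 0.0036, 4: 0.0001}


def mean(pmf):
    """E[X] = sum of x * f(x)."""
    return sum(x * p for x, p in pmf.items())


def variance_definition(pmf):
    """V(X) = sum of (x - mu)^2 * f(x), i.e. the table method."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


def variance_shortcut(pmf):
    """V(X) = E[X^2] - (E[X])^2."""
    second_moment = sum(x ** 2 * p for x, p in pmf.items())
    return second_moment - mean(pmf) ** 2


mu = mean(pmf)
var_def = variance_definition(pmf)
var_short = variance_shortcut(pmf)

print(f"mean                  = {mu:.4f}")
print(f"variance (definition) = {var_def:.4f}")
print(f"variance (shortcut)   = {var_short:.4f}")
```

Both variance functions should agree up to floating-point rounding, which is the same check the shortcut formula performs by hand.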
Averaging probabilities like AVERAGE(0.6561, 0.2916, …) would tell you the mean of a list of five numbers that happen to be probabilities. That has no direct meaning for how many error bits you expect.
      We want the average of the outcomes (0,1,2,3,4) over many packets, weighted by how likely each outcome is.
      That is exactly the definition:
      μ = E[X] = Σ x·P(X=x)
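To make the contrast concrete, the sketch below computes both quantities side by side. The variable names are illustrative; the probabilities are the ones from the text. Note that a plain average of five probabilities that sum to 1 is always 1/5, whatever the distribution looks like.

```python
outcomes = [0, 1, 2, 3, 4]
probs = [0.6561, 0.2916, 0.0486, 0.0036, 0.0001]

# Plain average of the probabilities: ignores the outcomes entirely.
plain_average = sum(probs) / len(probs)

# Expected value: each outcome weighted by its probability.
expected_value = sum(x * p for x, p in zip(outcomes, probs))

print(f"plain average of probabilities = {plain_average:.4f}")
print(f"expected value E[X]            = {expected_value:.4f}")
```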
    
Think of a big class experiment: for each 4-bit packet, record the number of error bits (X). After many packets, the class average of those X’s will be close to 0.4. That’s what the 0.4 “means”: on average, 0.4 errors per 4-bit packet.
      (These probabilities match a Binomial model with n=4 bits and error probability p=0.1, so 
      E[X] = n·p = 4·0.1 = 0.4.)
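The match with the Binomial model can be verified directly. The sketch below, using only the standard library, rebuilds the Binomial(4, 0.1) probabilities and compares them with the table values; the variable names are illustrative.

```python
from math import comb, isclose

n, p = 4, 0.1

# Binomial PMF: P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)
binomial_pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Probabilities quoted in the text for x = 0..4.
table_pmf = [0.6561, 0.2916, 0.0486, 0.0036, 0.0001]

matches = all(isclose(b, t, abs_tol=1e-12) for b, t in zip(binomial_pmf, table_pmf))
binomial_mean = sum(k * q for k, q in enumerate(binomial_pmf))

print("PMF matches table:", matches)
print(f"mean from PMF = {binomial_mean:.4f}, n*p = {n * p:.4f}")
```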
    
Assume each bit flips incorrectly with probability p = 0.1 independently (Binomial(4, 0.1)).
As the number of packets grows, the sample mean of X should hover near 0.4.
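The "many packets" experiment can be imitated with a small simulation. This is a sketch under the stated assumption of independent bit errors with p = 0.1; the function name `simulate_sample_mean` and the fixed seed are illustrative choices so that runs are repeatable.

```python
import random


def simulate_sample_mean(num_packets, n_bits=4, p_error=0.1, seed=0):
    """Average number of error bits per packet over num_packets simulated packets."""
    rng = random.Random(seed)  # local generator so the result is reproducible
    total_errors = 0
    for _ in range(num_packets):
        # Each bit is in error independently with probability p_error.
        total_errors += sum(rng.random() < p_error for _ in range(n_bits))
    return total_errors / num_packets


for num_packets in (100, 10_000, 100_000):
    print(num_packets, simulate_sample_mean(num_packets))
```

With few packets the sample mean can sit noticeably away from 0.4; with more packets it should settle closer, which is the behavior the expected value describes.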
Design A: $3 million with P = 1 → E(X) = 3, σ = 0 (in $ millions)
Design B: $7 million with P = 0.3, $2 million with P = 0.7 → E(X) = 7(0.3) + 2(0.7) = 3.5, σ = √5.25 ≈ 2.29 (in $ millions)
📌 Design B has the higher expected return, but also higher risk.
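The comparison between the two designs can be reproduced with the same mean and variance formulas. A minimal sketch, with revenues in $ millions; the helper name `mean_and_sd` is an illustrative choice.

```python
from math import sqrt


def mean_and_sd(pmf):
    """Return (E[X], sigma) for a PMF given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())
    return mu, sqrt(var)


design_a = {3: 1.0}          # $3 million for certain
design_b = {7: 0.3, 2: 0.7}  # $7 million w.p. 0.3, $2 million w.p. 0.7

mu_a, sd_a = mean_and_sd(design_a)
mu_b, sd_b = mean_and_sd(design_b)

print(f"Design A: mean = {mu_a:.2f}, sd = {sd_a:.2f}")
print(f"Design B: mean = {mu_b:.2f}, sd = {sd_b:.2f}")
```

The standard deviation is used here as the measure of risk: a larger σ means the realized revenue can land further from the expected value.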
What is E(X²) for the same PMF?
⚠️ Note: E(X²) ≠ [E(X)]²
E(aX + b) = a·E(X) + b
V(aX + b) = a² · V(X)
Let X have values 1, 2, 3 with probabilities 0.2, 0.5, 0.3. Find μ, σ², σ.
        μ = 1×0.2 + 2×0.5 + 3×0.3 = 2.1
        E(X²) = 1²×0.2 + 2²×0.5 + 3²×0.3 = 4.9
        σ² = E(X²) − μ² = 4.9 − (2.1)² = 0.49
        σ = √0.49 = 0.7
      
      Put your x values in A2:A6 and the matching probabilities in B2:B6 (with SUM(B2:B6)=1).
    
Mean: =SUMPRODUCT(A2:A6, B2:B6)
E[X²]: =SUMPRODUCT((A2:A6)^2, B2:B6)
Variance (definition): =SUMPRODUCT((A2:A6 - SUMPRODUCT(A2:A6,B2:B6))^2, B2:B6)
Variance (shortcut): =SUMPRODUCT((A2:A6)^2, B2:B6) - (SUMPRODUCT(A2:A6,B2:B6))^2
Standard deviation: =SQRT( ... )
Mean:
  U = aX + b
  E[U] = E[aX + b] = a·E[X] + b
Variance:
  V(U) = E[(U − E[U])²]
       = E[(aX + b − (aE[X] + b))²]
       = E[(a(X − E[X]))²]
       = a² · E[(X − E[X])²]
       = a² · V(X)
  
  Shift b cancels (doesn’t change spread). Scaling by a stretches distances by |a|, so squared distances scale by a².
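The two rules can be checked numerically by transforming every outcome and recomputing the mean and variance from scratch. This is a sketch using the bit-error PMF from the text; the helper names `mean_var` and `transform_stats` are illustrative, and a = 3, b = 5 is just one example choice.

```python
xs = [0, 1, 2, 3, 4]
probs = [0.6561, 0.2916, 0.0486, 0.0036, 0.0001]


def mean_var(values, probs):
    """Return (mean, variance) of a discrete distribution."""
    mu = sum(v * p for v, p in zip(values, probs))
    var = sum((v - mu) ** 2 * p for v, p in zip(values, probs))
    return mu, var


def transform_stats(a, b):
    """Mean and variance of U = a*X + b, computed directly from the u values."""
    us = [a * x + b for x in xs]
    return mean_var(us, probs)


mu_x, var_x = mean_var(xs, probs)
a, b = 3, 5
mu_u, var_u = transform_stats(a, b)

print(f"E[X] = {mu_x:.4f}, V(X) = {var_x:.4f}")
print(f"E[U] = {mu_u:.4f}, a*E[X] + b = {a * mu_x + b:.4f}")
print(f"V(U) = {var_u:.4f}, a^2 * V(X) = {a**2 * var_x:.4f}")
```

Calling `transform_stats` with a negative `a`, or with different values of `b`, is a quick way to see that only the magnitude of `a` affects the variance.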
X ∈ {0,1,2,3,4} with probabilities [0.6561, 0.2916, 0.0486, 0.0036, 0.0001]. For X: μX=0.4 and V(X)=0.36.
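Worked table for a = 3, b = 5 (U = 3X + 5):
| x | f(x) | (x−μX)² | TermX=f(x)(x−μX)² | u = a·x + b | μU | (u−μU)² | TermU=f(x)(u−μU)² | Ratio TermU/TermX |
|---|---|---|---|---|---|---|---|---|
| 0 | 0.6561 | 0.16 | 0.104976 | 5 | 6.2 | 1.44 | 0.944784 | 9 |
| 1 | 0.2916 | 0.36 | 0.104976 | 8 | 6.2 | 3.24 | 0.944784 | 9 |
| 2 | 0.0486 | 2.56 | 0.124416 | 11 | 6.2 | 23.04 | 1.119744 | 9 |
| 3 | 0.0036 | 6.76 | 0.024336 | 14 | 6.2 | 60.84 | 0.219024 | 9 |
| 4 | 0.0001 | 12.96 | 0.001296 | 17 | 6.2 | 116.64 | 0.011664 | 9 |
| Sum | | | V(X) = 0.36 | | | | V(U) = 3.24 | Ratios (when defined) should equal a² |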
The mean shifts with b: E[aX+b] = a·E[X] + b. Here, μX=0.4 and μU=3·0.4+5=6.2.
The variance ignores b but scales with a²: V(aX+b) = a²·V(X). The shift b just re-centers; it does not change spread.
With V(X)=0.36: V(U)=3²·0.36 = 9·0.36 = 3.24.
The “Ratio” column equals a² = 9.00 on every row where the denominator is defined; this is why the totals obey V(U)=a²V(X).
Units: if X is in “errors per packet,” then U=3X+5 has variance in “(3·errors)² = 9·errors².” (This is why variance scales by a².)
Standard deviation scales by |a|: σ(U)=|a|·σ(X). Here, σ(X)=√0.36=0.6 and σ(U)=√3.24=1.8 = 3·0.6.
Try changing b only (keep a=1): μ changes; V stays the same.
Try changing a only (keep b=0): the “Ratio” column becomes a² everywhere; the bottom sums obey V(U)=a²V(X).
Try a negative a (e.g., a=-2): the ratio is still a²=4; the sign doesn’t matter for variance.
Common mistake: averaging the probabilities (e.g., AVERAGE(0.6561, ...)). That’s not an expected value.
Common mistake: assuming b changes variance. It doesn’t; only a does, via a².