Generating Functions

2021 May 20 See all posts

A generating function is a device somewhat similar to a bag. Instead of carrying many little objects detachedly, which could be embarrassing, we put them all in a bag, and then we have only one object to carry, the bag.

- George Pólya, Mathematics and plausible reasoning (1954)

Probability generating function

We define a PGF as g(t)=E[tX]g(t) = E[t^X]. for this article XX will be a discrete random variable.

This is an interesting defintion since by LOTUS with pk=P(X=k)p_k = P(X = k)

g(t)=E[tX]=k=0pktkg(t) = E[t^X] = \sum_{k=0}^{\infty} p_k t^k

The crazy idea here is we can use calculus on g(t)g(t), since its an differentiable function!

g(t)=k=0pkktk1g'(t) = \sum_{k=0}^{\infty} p_k kt^{k-1}

Letting t=0t=0 every term dies except the k=1k=1 term which is p1100=p1p_1 \cdot 1 \cdot 0^0 = p_1

The general case of what we discovered is the nth derivative at zero being pnn!p_n n!

g(n)(0)=k=0pkk!(kn)!0kn=pnn!g^{(n)}(0) = \sum_{k=0}^{\infty} p_k \frac{k!}{(k - n)!} 0^{k - n} = p_n n!

That means we can recover each pk=P(X=k)p_k = P(X = k) from the PGF g(t)=E[tX]g(t) = E[t^X], amazing!

Plugging in t=1t=1 gives us E[X]E[X] by LOTUS

g(1)=k=0pkk1k1=k=0kpk=E[X]g'(1) = \sum_{k=0}^{\infty} p_k k \cdot 1^{k-1} = \sum_{k=0}^{\infty} k p_k = E[X]

This is useful because finding expected values can be easier then finding probabilies since we have linearity.

Dice problem

Let X=X1++X6X = X_1 + \dots + X_6 and let XiX_i be the number rolled by the ii'th die

Since XiX_i are independent we use the fact that E[XY]=E[X]E[Y]E[XY] = E[X]E[Y] if XX and YY indep

g(t)=E[tX]=E[tX1++X6]=E[tX1]E[tX6]=(E[tX1])6g(t) = E[t^X] = E[t^{X_1 + \dots + X_6}] = E[t^{X_1}] \cdots E[t^{X_6}] = \left(E[t^{X_1}]\right)^6

We know E[Xi]=16(t1+t2++t6)E[X_i] = \frac{1}{6}(t^1 + t^2 + \dots + t^6) by LOTUS, plugging that in gives us

g(t)=(E[tX1])6=(16(t1++t6))6g(t) = \left(E[t^{X_1}]\right)^6 = \left(\frac{1}{6}(t^1 + \dots + t^6)\right)^6

We know g(18)(0)=P(X=18)g^{(18)}(0) = P(X = 18) leading us to the amazing conclusion

The probability of six die summing to 18 is the coefficient of the t18t^{18} term

Simplifying this to make finding the coefficients easier is now a matter of algebra not probability!


Take a polynomial, say the from the PGF of summing 3 fair dice

(t1+t2++t6)3(t^1 + t^2 + \dots + t^6)^3

Think about getting the t10t^{10} term, you can pick t6t^6, t3t^3 and t1t^1. but you could also pick different terms, like t5t^5, t4t^4 and t1t^1.

Counting the ways of picking the terms that have product t10t^{10} is the same as counting dice rolls. except here we have the full power of algebra at our disposal.

Proof of expectation

(Assume X,Y are discrete, the continuous case is done by interchanging integrals)

Proof is by factoring out xP(X=x)xP(X=x) and then factoring out the sum

x=0y=0xyP(X=x)P(Y=y)=x=0xP(X=x)y=0yP(Y=y)=E[X]E[Y]\sum_{x=0}^{\infty} \sum_{y=0}^{\infty} xy P(X=x)P(Y=y) = \sum_{x=0}^{\infty} xP(X=x) \sum_{y=0}^{\infty} y P(Y=y) = E[X]E[Y]

Background story

If you've done any probability you've likely come across a problem like this

What is the probability that the total after rolling 4 fair dice is 21?

When I heard this problem in stat110 strategic practice I tried to solve it in general, after spending several hours I gave up. It turns out the solution was just to manually list cases!

This bugged me, as a programmer I hate doing things manually. but not seeing a better way I continued on.

Interestingly, a similar thing happened to Frederick Mosteller in his sophomore year[1]

  1. ↩︎