In recent months I've been interviewing with multiple companies, and I was frequently asked probability brainteasers that look complex but are quite easy to solve using generating functions. So I thought I would detail the gist of the method in a post, even though it is of course covered in the standard reference on the matter: generatingfunctionology (Archived).

What is a Generating Function?

A generating function is a compact way to encode a sequence of numbers or of other objects (like matrices or functions). To make matters more concrete, suppose we have an infinite sequence c_n, n \in \bbN, of real numbers. These can be "hung" on a formal power series like C(x) = \sum_{n \in \bbN} c_n x^n, so that one can do both algebraic and analytical calculations on the series, making it tractable to compute certain interesting quantities associated with the sequence.
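For instance, the constant sequence c_n = 1 is encoded by the geometric series C(x) = \frac1{1-x}. Here is a minimal sketch (using sympy, which is my choice of tool and not something the post depends on) that expands this closed form and recovers the coefficients:

```python
import sympy as sp

x = sp.symbols("x")

# The constant sequence c_n = 1 is encoded by 1 + x + x^2 + ...,
# whose closed form is the geometric series 1 / (1 - x).
C = 1 / (1 - x)

# Expanding the closed form recovers the original coefficients c_0..c_5.
expansion = sp.series(C, x, 0, 6).removeO()
print([expansion.coeff(x, n) for n in range(6)])  # [1, 1, 1, 1, 1, 1]
```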

Typically the sequence c_n is defined by a recurrence on earlier values, and then it is easy to obtain a closed-form expression for C(x). For example, if c_{n + 2} = \alpha c_{n+1} + \beta c_n, we can multiply the equation by x^n, sum over all n, and keep track of the initial terms c_0 and c_1, obtaining

\begin{align*} \sum_n c_{n+2} x^n & = \alpha \sum_n c_{n+1} x^n + \beta \sum_n c_n x^n \\ \frac{C(x) - c_0 - c_1 x}{x^2} & = \frac\alpha{x} (C(x) - c_0) + \beta C(x) \\ C(x) & = \frac{c_0 + (c_1 - \alpha c_0) x}{1 - \alpha x - \beta x^2}, \end{align*}

which for c_0 = 1 and c_1 = \alpha reduces to C(x) = \frac1{1 - \alpha x - \beta x^2}.
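As a sanity check (my own toy instance, not from the references above), take \alpha = \beta = 1 and c_0 = c_1 = 1, so that the coefficients of C(x) are the Fibonacci numbers; the sketch below, again using sympy, compares the Taylor coefficients of the closed form against the recurrence:

```python
import sympy as sp

x = sp.symbols("x")
alpha, beta = 1, 1   # recurrence c_{n+2} = alpha * c_{n+1} + beta * c_n
c0, c1 = 1, 1        # initial values (these give the Fibonacci numbers)

# Closed form derived above.
C = (c0 + (c1 - alpha * c0) * x) / (1 - alpha * x - beta * x**2)

# Extract the first few Taylor coefficients of C(x).
N = 10
expansion = sp.series(C, x, 0, N).removeO()
coeffs = [expansion.coeff(x, n) for n in range(N)]

# Compare against the sequence computed directly from the recurrence.
seq = [c0, c1]
while len(seq) < N:
    seq.append(alpha * seq[-1] + beta * seq[-2])

print(coeffs)  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
assert coeffs == seq
```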

Generating Functions for Probability Problems

Suppose you have a discrete random variable X taking values t \in \bbN with probabilities p_t = P(X = t), from which one can build the generating function P(x) = \sum_t p_t x^t. We can then observe how easy it is to compute the mean and variance of the distribution

\begin{align*} \bbE[X] & = \sum_t t\ p_t = P'(1) \\ \bbE[X^2] & = \sum_t t^2\ p_t = P'(1) + P''(1) \\ Var[X] & = P'(1) + P''(1) - P'(1)^2 \end{align*}

and notice how handy the "derivative trick" is, where one realizes that \sum_k k\ c_k x^k = x \frac{d}{dx}\lrt{\sum_k c_k x^k} = x C'(x), and similarly \sum_k k(k-1)\ c_k x^k = x^2 C''(x), which evaluated at x = 1 gives the second-moment formula above.
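As a concrete illustration (a toy example of mine, assuming sympy is available), take X to be a fair six-sided die, so that P(x) = (x + x^2 + \dots + x^6)/6; the formulas above give \bbE[X] = 7/2 and Var[X] = 35/12:

```python
import sympy as sp

x = sp.symbols("x")

# Probability generating function of a fair six-sided die:
# P(x) = sum_t P(X = t) * x^t = (x + x^2 + ... + x^6) / 6.
P = sum(x**t for t in range(1, 7)) / 6

dP = sp.diff(P, x)
d2P = sp.diff(P, x, 2)

mean = dP.subs(x, 1)                            # E[X]   = P'(1)
second_moment = dP.subs(x, 1) + d2P.subs(x, 1)  # E[X^2] = P'(1) + P''(1)
variance = second_moment - mean**2              # Var[X] = P'(1) + P''(1) - P'(1)^2

print(mean, variance)  # 7/2 35/12
```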

In other cases it is easier to write the generating function as a product of terms P(x) = \prod_n p_n(x) (for instance when X is a sum of independent random variables), provided the product converges. In this case one can consider Q(x) = \log P(x) = \sum_n \log p_n(x), which makes it easier to compute the mean and variance, since they are given by

\begin{align*} \bbE[X] & = Q'(1) \\ Var[X] & = Q'(1) + Q''(1), \end{align*}

which follows from the formulas above since P(1) = 1, so that Q'(1) = P'(1) and Q''(1) = P''(1) - P'(1)^2.
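A classic case (again my own illustrative example) is a sum of independent Bernoulli variables with success probabilities p_i, whose generating function factors as P(x) = \prod_i (1 - p_i + p_i x); the log trick then recovers the known mean \sum_i p_i and variance \sum_i p_i (1 - p_i):

```python
import sympy as sp

x = sp.symbols("x")
ps = [sp.Rational(1, 2), sp.Rational(1, 3), sp.Rational(1, 5)]  # success probabilities

# X = sum of independent Bernoulli(p_i); its PGF factors as a product,
# so Q(x) = log P(x) turns the product into a sum of logs.
Q = sum(sp.log(1 - p + p * x) for p in ps)

dQ = sp.diff(Q, x).subs(x, 1)
d2Q = sp.diff(Q, x, 2).subs(x, 1)

mean = sp.simplify(dQ)            # E[X]   = Q'(1)          = sum_i p_i
variance = sp.simplify(dQ + d2Q)  # Var[X] = Q'(1) + Q''(1) = sum_i p_i (1 - p_i)

print(mean, variance)  # 31/30 569/900
```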

So I would say that generating functions are definitely useful in this kind of problem, and I hope that this couple of tricks is useful to someone besides me.