Theoretical guarantee from the Law of Large Numbers

Rationale

Simple Monte Carlo is the foundation for more complex Monte Carlo methods used by Bayesian practioners (e.g. Importance Sampling and Markov chain Monte Carlo (MCMC))

Approximation of expectations using forward simulation

As before, we want to compute an expectation \(\mathbb{E}[g(X, Y)]\)

Imagine a very large decision tree, but where most branches have very low probability and only few have large probability

In this case, instead of computing the exact expectation by iterating over each of the leaves as before, we will approximate expectations using forward simulation (a method know as simple Monte Carlo)

This is done as follows:

Call your forward simulator \(M\) times.

Denote the output at iteration \(m \in \{1, 2, \dots M\}\) by: \[(X^{(m)}, Y^{(m)}) \sim p_{X, Y}(\cdot)\]

Compute \(g\) on each, call each of the \(M\) outputs \(G^{(m)}\)\[G^{(m)}= g(X^{(m)}, Y^{(m)}).\]

Return the average \[\hat G_M = \frac{1}{M} \sum_{m=1}^M G^{(m)}.\]

Intuitively, the output \(\hat G_M\) has the nice property: \[\hat G_M \approx \mathbb{E}[g(X, Y)]. \tag{1}\]

\[
\begin{align*}
X &\sim {\mathrm{Unif}}\{1, 2, 3, 4\} \\
Y | X &\sim {\mathrm{Unif}}\{1, \dots, X\}.
\end{align*}
\]

Interpretation: you roll a D&D d4 dice (blue one on the image), then pick an integer uniformly between 1 and the number on the dice.

Let us use Monte Carlo to estimate \(\mathbb{E}[X^Y]\):

require(extraDistr)

Loading required package: extraDistr

set.seed(4)# Recall our forward simulation code from earlier:forward_simulate_roll_and_pick <-function() { x <-rdunif(1, min=1, max=4) # 1 uniform between 1-6, i.e. a dice y <-rdunif(1, min=1, max=x)c(x, y) # return a vector with these two realizations}sum <-0.0n_iterations <-10000for (iteration in1:n_iterations) { sample <-forward_simulate_roll_and_pick() sum <- sum + sample[1]^sample[2]}print(sum/n_iterations)

[1] 25.7831

Guarantees from the Law of Large Numbers

Question: How can we make \(\approx\) more formal in Equation 1?

Proposition (Law of Large Numbers, LLN): if \(Z_1, Z_2, \dots\) are i.i.d. random variables with \(\mathbb{E}|Z_i| < \infty\), then^{1}\[ \frac{1}{M} \sum_{m=1}^M Z_m \to \mathbb{E}[Z_1].\]

Picking \(Z_m = G^{(m)}\) we arrive to the following formalization of \(\approx\): for any approximation error tolerance, we can find a number of iterations \(M\) large enough such that we will be within that error tolerance with high probability after \(M\) iterations.

Footnotes

Here “\(\to\)” denotes a suitable notion of convergence of random variables. In STAT 302 you may have seen LLN with “convergence in probability”, but this can be strengthen to “convergence almost sure.” The difference between these notions of convergence will not matter in this course.↩︎