Forward sampling

Outline

Topics

  • Notion of forward sampling (also known as forward simulation)
  • How to do it in practice
    • Useful functions
    • Graphical models

Rationale

  • Sampling is the main way Bayesian inference is performed nowadays.
  • We introduce here the simplest flavour of sampling, forward sampling.
  • Bayesian inference mostly uses a more complicated type of sampling called posterior sampling which we will cover later.
  • But forward sampling is still helpful to help debug Bayesian inference software as we will see soon.

Forward sampling as depth-first traversal

Recall our recurring bag sampling example, with its corresponding decision tree:

flowchart TD
S__and__X_0 -- 1.0 --> S__and__X_0__and__Y1_false["Y1=false"]
S__and__X_2__and__Y1_true -- 1.0 --> S__and__X_2__and__Y1_true__and__Y2_true["Y2=true"]
S -- 0.33 --> S__and__X_0["X=0"]
S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_true__and__Y4_false["Y4=false"]
S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_true__and__Y4_false["Y4=false"]
S__and__X_0__and__Y1_false__and__Y2_false__and__Y3_false -- 1.0 --> S__and__X_0__and__Y1_false__and__Y2_false__and__Y3_false__and__Y4_false["Y4=false"]
S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_false__and__Y4_true["Y4=true"]
S -- 0.33 --> S__and__X_1["X=1"]
S__and__X_2__and__Y1_true__and__Y2_true__and__Y3_true -- 1.0 --> S__and__X_2__and__Y1_true__and__Y2_true__and__Y3_true__and__Y4_true["Y4=true"]
S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_false__and__Y4_true["Y4=true"]
S__and__X_1__and__Y1_false__and__Y2_true -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_false["Y3=false"]
S__and__X_1__and__Y1_true__and__Y2_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_false["Y3=false"]
S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_false__and__Y4_false["Y4=false"]
S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_false__and__Y4_false["Y4=false"]
S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_true__and__Y4_true["Y4=true"]
S__and__X_1 -- 0.5 --> S__and__X_1__and__Y1_false["Y1=false"]
S__and__X_1__and__Y1_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_false["Y2=false"]
S__and__X_1__and__Y1_false__and__Y2_true -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_true["Y3=true"]
S__and__X_1__and__Y1_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_true["Y2=true"]
S__and__X_0__and__Y1_false -- 1.0 --> S__and__X_0__and__Y1_false__and__Y2_false["Y2=false"]
S__and__X_1__and__Y1_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_false["Y2=false"]
S__and__X_1__and__Y1_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_true["Y2=true"]
S__and__X_2 -- 1.0 --> S__and__X_2__and__Y1_true["Y1=true"]
S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_true__and__Y4_false["Y4=false"]
S__and__X_1 -- 0.5 --> S__and__X_1__and__Y1_true["Y1=true"]
S -- 0.33 --> S__and__X_2["X=2"]
S__and__X_1__and__Y1_true__and__Y2_false -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_false["Y3=false"]
S__and__X_2__and__Y1_true__and__Y2_true -- 1.0 --> S__and__X_2__and__Y1_true__and__Y2_true__and__Y3_true["Y3=true"]
S__and__X_1__and__Y1_false__and__Y2_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_true["Y3=true"]
S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_false__and__Y4_false["Y4=false"]
S__and__X_1__and__Y1_true__and__Y2_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_true["Y3=true"]
S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_true__and__Y4_true["Y4=true"]
S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_false__and__Y4_true["Y4=true"]
S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_false__and__Y4_true["Y4=true"]
S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_true__and__Y4_false["Y4=false"]
S__and__X_1__and__Y1_true__and__Y2_false -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_true["Y3=true"]
S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_false__and__Y4_false["Y4=false"]
S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_true__and__Y4_true["Y4=true"]
S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_true__and__Y4_true["Y4=true"]
S__and__X_1__and__Y1_false__and__Y2_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_false["Y3=false"]
S__and__X_0__and__Y1_false__and__Y2_false -- 1.0 --> S__and__X_0__and__Y1_false__and__Y2_false__and__Y3_false["Y3=false"]

  • Forward simulation is a type of tree traversal. I.e. moving from node to node in the tree.
  • Forward simulation is a recursive process initialized at the root of the decision tree (labelled \(S\)).
    • When we are a node \(v\) in the tree, we pick one of \(v\)’s children at random.
    • We recurse until we reach a leaf.
      • From this leaf we obtain an outcome and hence a realization for all random variables, both “observed” and “unobserved.”

Forward sampling as specifying a model

We have encountered that notation earlier:

\[ \begin{align*} X &\sim {\mathrm{Unif}}\{0, 1, 2\} \\ Y_i | X &\sim {\mathrm{Bern}}(X/2). \end{align*} \]

  • This notation is a recipe providing all the information required to perform forward sampling.
    • Specifically, the PMF to use at each recursion step.
    • In continuous models, it will be the same idea except that we will have a probability density instead of a PMF.

Example 1

You will practice forward sampling in Ex1, Q1.1.2

Example 2

Consider the following model:

\[ \begin{align*} X &\sim {\mathrm{Unif}}\{1, 2, 3, 4\} \\ Y | X &\sim {\mathrm{Unif}}\{1, \dots, X\}. \end{align*} \]

Interpretation: you roll a D&D d4 dice (blue one on the image), then pick an integer uniformly between 1 and the number on the dice.

Example of forward simulation code for the above “censored dice”:

require(extraDistr)
Loading required package: extraDistr
set.seed(4)

forward_simulate_roll_and_pick <- function() {
  x <- rdunif(1, min=1, max=4) # 1 uniform between 1-6, i.e. a dice
  y <- rdunif(1, min=1, max=x)
  c(x, y) # return a vector with these two realizations
}

forward_simulate_roll_and_pick()
[1] 3 1
forward_simulate_roll_and_pick()
[1] 2 1
forward_simulate_roll_and_pick()
[1] 4 2