flowchart TD
S -- 0.2 --> S__and__X_false["X=false"]
S -- 0.8 --> S__and__X_true["X=true"]
Expectations
Outline
Topics
- Expectation for discrete random models
- Law of the Unconscious Statistician
Rationale
Expectation is the main tool for translating a posterior distribution into the various outputs of Bayesian inference (point estimates, credible intervals, predictions, actions).
Expectation of a single random variable
Recall: \[\mathbb{E}[X] = \sum_x x p_X(x),\] where the sum is over the point masses of \(X\), i.e. \(\{x : p_X(x) > 0\}\).
Example: compute \(\mathbb{E}[X]\) if \(X \sim {\mathrm{Bern}}(p)\), with \(p = 0.8\).
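The definition above can be checked numerically. The following is a minimal sketch, assuming the PMF is stored as a Python dictionary from values to probabilities; the helper name `expectation` is hypothetical and not part of the course material.

```python
from fractions import Fraction

def expectation(pmf):
    """E[X] = sum over x of x * p_X(x), restricted to the point masses."""
    return sum(x * p for x, p in pmf.items() if p > 0)

p = Fraction(4, 5)                 # p = 0.8, written as an exact fraction
bernoulli_pmf = {0: 1 - p, 1: p}   # X ~ Bern(p)

mean = expectation(bernoulli_pmf)
print(mean)  # 4/5
```

Only the point mass at \(x = 1\) contributes, so \(\mathbb{E}[X] = 0 \cdot 0.2 + 1 \cdot 0.8 = 0.8 = p\).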
Law of the Unconscious Statistician
Proposition: if \(g\) is some function, \[\mathbb{E}[g(X)] = \sum_x g(x) p_X(x).\]
Example: compute \(\mathbb{E}[X^2]\) if \(X \sim {\mathrm{Bern}}(p)\), and hence \(\operatorname{Var}[X] = \mathbb{E}[X^2] - (\mathbb{E}[X])^2\).
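As a sanity check on this example, here is a small sketch of LOTUS in code. It assumes \(p = 0.8\) as in the earlier example (the statement above leaves \(p\) generic), and the helper name `lotus` is hypothetical.

```python
from fractions import Fraction

def lotus(g, pmf):
    """E[g(X)] = sum over x of g(x) * p_X(x), restricted to the point masses."""
    return sum(g(x) * p for x, p in pmf.items() if p > 0)

p = Fraction(4, 5)        # assumed value, borrowed from the earlier example
pmf = {0: 1 - p, 1: p}    # X ~ Bern(p)

second_moment = lotus(lambda x: x**2, pmf)   # E[X^2]
mean = lotus(lambda x: x, pmf)               # E[X]
variance = second_moment - mean**2           # Var[X] = E[X^2] - (E[X])^2
print(second_moment, variance)  # 4/5 4/25
```

Since \(x^2 = x\) for \(x \in \{0, 1\}\), we get \(\mathbb{E}[X^2] = p\) and hence \(\operatorname{Var}[X] = p - p^2 = p(1-p)\), which is \(0.16\) here.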
Question: Compute \(\mathbb{E}[1/(X+1)]\), where \(X \sim {\mathrm{Bern}}(1/3)\).
- 3/5
- 3/4
- 5/6
- 1/3
- None of the above
Use \(g(x) = 1/(x+1)\) in LOTUS:
\[ \mathbb{E}[1/(X+1)] = \sum_{x\in\{0, 1\}} g(x) p(x) = \frac{2}{3} \frac{1}{0+1} + \frac{1}{3}\frac{1}{1+1} = \frac{5}{6}. \]
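The arithmetic above can be reproduced with exact fractions. This is only an illustrative sketch; the variable names are arbitrary.

```python
from fractions import Fraction

# X ~ Bern(1/3): P(X = 0) = 2/3 and P(X = 1) = 1/3
pmf = {0: Fraction(2, 3), 1: Fraction(1, 3)}

def g(x):
    """g(x) = 1 / (x + 1), kept as an exact fraction."""
    return Fraction(1, x + 1)

answer = sum(g(x) * p for x, p in pmf.items())
print(answer)  # 5/6
```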
Expectation of a function of several random variables
Let us go back to our running example:
- Imagine a bag with 3 coins each with a different probability parameter \(p\)
- Coin \(i\in \{0, 1, 2\}\) has bias \(i/2\)—in other words:
- First coin: bias is \(0/2 = 0\) (i.e. both sides are “heads”, \(p = 0\))
- Second coin: bias is \(1/2 = 0.5\) (i.e. standard coin, \(p = 1/2\))
- Third coin: bias is \(2/2 = 1\) (i.e. both sides are “tails”, \(p = 1\))
- Consider the following two-step sampling process:
- Step 1: pick one of the three coins, but do not look at it!
- Step 2: flip the coin 4 times
- Mathematically, this probability model can be written as follows: \[ \begin{align*} X &\sim {\mathrm{Unif}}\{0, 1, 2\} \\ Y_i | X &\sim {\mathrm{Bern}}(X/2), \quad i \in \{1, 2, 3, 4\} \end{align*} \tag{1}\]
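To make the two-step process concrete, here is a minimal forward-simulation sketch of the model in Equation 1. The function name `sample_model` and the seed are assumptions made for illustration; the flips are encoded as 0/1.

```python
import random

def sample_model(rng, n_flips=4):
    """Draw one realization (x, [y_1, ..., y_n]) from the coin-bag model."""
    x = rng.choice([0, 1, 2])   # Step 1: X ~ Unif{0, 1, 2}
    bias = x / 2                # Y_i | X ~ Bern(X / 2)
    # Step 2: flip the chosen coin n_flips times.
    ys = [1 if rng.random() < bias else 0 for _ in range(n_flips)]
    return x, ys

rng = random.Random(1)          # arbitrary seed, only for reproducibility
x, ys = sample_model(rng)
print(x, ys)
```

When \(X = 0\) every flip is 0 and when \(X = 2\) every flip is 1; only the coin \(X = 1\) produces random flips.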
Example: computing \(\mathbb{E}[X (Y_1+1)]\) (similar to what you will be doing in the exercise in question 1.1)
Note: this is of the form \(\mathbb{E}[g(\dots)]\), so we can use the Law of the Unconscious Statistician.
How to do it:
- first, identify \(g\), here it is \(g(x, y_1, \dots, y_4) = x(y_1+1)\) (in the exercise it is slightly different)
- denote by \(p\) the joint PMF of all the random variables in the model
- compute the expectation using \[\mathbb{E}[g(X, Y_1, \dots, Y_4)] = \sum_x \sum_{y_1} \sum_{y_2} \dots \sum_{y_4} g(x, y_1, \dots, y_4) p(x, y_1, y_2, y_3, y_4).\]
- Each sum runs over the point masses of the corresponding variable, as before, e.g. \(x \in \{0, 1, 2\}\).
- Recall: \(p(x, y_1, y_2, y_3, y_4)\) can be computed using the chain rule.
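The steps above can be sketched in code as follows. This is a brute-force illustration, not necessarily how the exercise is meant to be solved; the helper names `joint_pmf` and `g` are hypothetical, and flips are encoded as 0/1.

```python
from fractions import Fraction
from itertools import product

def joint_pmf(x, ys):
    """Chain rule: p(x, y_1, ..., y_4) = p(x) * product over i of p(y_i | x)."""
    prob = Fraction(1, 3)       # X ~ Unif{0, 1, 2}
    bias = Fraction(x, 2)       # Y_i | X ~ Bern(X / 2)
    for y in ys:
        prob *= bias if y == 1 else 1 - bias
    return prob

def g(x, ys):
    """g(x, y_1, ..., y_4) = x * (y_1 + 1); ys[0] holds y_1."""
    return x * (ys[0] + 1)

# LOTUS: iterate over every configuration (x, y_1, ..., y_4).
expected_value = sum(
    g(x, ys) * joint_pmf(x, ys)
    for x in range(3)
    for ys in product([0, 1], repeat=4)
)
print(expected_value)  # 11/6
```

Configurations with zero probability (for example \(x = 0\) with any \(y_i = 1\)) contribute nothing, so including them in the loop does not change the result.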
Recall the decision tree: how can we visualize the above equation? Each path from the root to a leaf corresponds to one term of the iterated sum.
flowchart TD
S__and__X_0 -- 1.0 --> S__and__X_0__and__Y1_false["Y1=false"]
S__and__X_2__and__Y1_true -- 1.0 --> S__and__X_2__and__Y1_true__and__Y2_true["Y2=true"]
S -- 0.33 --> S__and__X_0["X=0"]
S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_true__and__Y4_false["Y4=false"]
S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_true__and__Y4_false["Y4=false"]
S__and__X_0__and__Y1_false__and__Y2_false__and__Y3_false -- 1.0 --> S__and__X_0__and__Y1_false__and__Y2_false__and__Y3_false__and__Y4_false["Y4=false"]
S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_false__and__Y4_true["Y4=true"]
S -- 0.33 --> S__and__X_1["X=1"]
S__and__X_2__and__Y1_true__and__Y2_true__and__Y3_true -- 1.0 --> S__and__X_2__and__Y1_true__and__Y2_true__and__Y3_true__and__Y4_true["Y4=true"]
S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_false__and__Y4_true["Y4=true"]
S__and__X_1__and__Y1_false__and__Y2_true -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_false["Y3=false"]
S__and__X_1__and__Y1_true__and__Y2_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_false["Y3=false"]
S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_false__and__Y4_false["Y4=false"]
S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_false__and__Y4_false["Y4=false"]
S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_true__and__Y4_true["Y4=true"]
S__and__X_1 -- 0.5 --> S__and__X_1__and__Y1_false["Y1=false"]
S__and__X_1__and__Y1_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_false["Y2=false"]
S__and__X_1__and__Y1_false__and__Y2_true -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_true["Y3=true"]
S__and__X_1__and__Y1_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_true["Y2=true"]
S__and__X_0__and__Y1_false -- 1.0 --> S__and__X_0__and__Y1_false__and__Y2_false["Y2=false"]
S__and__X_1__and__Y1_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_false["Y2=false"]
S__and__X_1__and__Y1_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_true["Y2=true"]
S__and__X_2 -- 1.0 --> S__and__X_2__and__Y1_true["Y1=true"]
S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_true__and__Y4_false["Y4=false"]
S__and__X_1 -- 0.5 --> S__and__X_1__and__Y1_true["Y1=true"]
S -- 0.33 --> S__and__X_2["X=2"]
S__and__X_1__and__Y1_true__and__Y2_false -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_false["Y3=false"]
S__and__X_2__and__Y1_true__and__Y2_true -- 1.0 --> S__and__X_2__and__Y1_true__and__Y2_true__and__Y3_true["Y3=true"]
S__and__X_1__and__Y1_false__and__Y2_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_true["Y3=true"]
S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_false__and__Y4_false["Y4=false"]
S__and__X_1__and__Y1_true__and__Y2_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_true["Y3=true"]
S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_true__and__Y4_true["Y4=true"]
S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_false__and__Y4_true["Y4=true"]
S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_true__and__Y3_false__and__Y4_true["Y4=true"]
S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_true__and__Y4_false["Y4=false"]
S__and__X_1__and__Y1_true__and__Y2_false -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_true["Y3=true"]
S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_false -- 0.5 --> S__and__X_1__and__Y1_true__and__Y2_false__and__Y3_false__and__Y4_false["Y4=false"]
S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_true__and__Y4_true["Y4=true"]
S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_true -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_true__and__Y3_true__and__Y4_true["Y4=true"]
S__and__X_1__and__Y1_false__and__Y2_false -- 0.5 --> S__and__X_1__and__Y1_false__and__Y2_false__and__Y3_false["Y3=false"]
S__and__X_0__and__Y1_false__and__Y2_false -- 1.0 --> S__and__X_0__and__Y1_false__and__Y2_false__and__Y3_false["Y3=false"]
Question: Assuming the paths are summed left to right in the above diagram, what is the last term in the iterated sum given by LOTUS? Use the convention ‘false’ = 0, ‘true’ = 1.
- 4/3
- 3/4
- 4
- 1/3
- None of the above.
The last term, \(g(x, y_1, y_2, y_3, y_4) p(x, y_1, y_2, y_3, y_4)\), for \(x = 2, y_1 = y_2 = y_3 = y_4 = 1\), is the product of \[ g(2, 1, 1, 1, 1) = 2(1 + 1) = 4 \] and, via the chain rule, \[ p(2, 1, 1, 1, 1) = \frac{1}{3} \times 1 \times 1 \times 1 \times 1 = \frac{1}{3}, \] so the last term is \(4 \times 1/3 = 4/3\).
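The same answer can be recovered by listing the paths of the decision tree in code. This sketch assumes that "left to right" corresponds to \(x\) in increasing order and then \((y_1, \dots, y_4)\) in lexicographic order with false (0) before true (1); that ordering is an assumption about the diagram layout, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def joint_pmf(x, ys):
    """Chain rule: p(x, y_1, ..., y_4) = p(x) * product over i of p(y_i | x)."""
    prob = Fraction(1, 3)       # X ~ Unif{0, 1, 2}
    bias = Fraction(x, 2)       # Y_i | X ~ Bern(X / 2)
    for y in ys:
        prob *= bias if y == 1 else 1 - bias
    return prob

def g(x, ys):
    """g(x, y_1, ..., y_4) = x * (y_1 + 1)."""
    return x * (ys[0] + 1)

# Keep only paths with positive probability: these are the leaves of the tree.
# Assumed ordering: x ascending, then ys lexicographic with 0 before 1.
paths = [
    (x, ys)
    for x in range(3)
    for ys in product([0, 1], repeat=4)
    if joint_pmf(x, ys) > 0
]

last_x, last_ys = paths[-1]
last_term = g(last_x, last_ys) * joint_pmf(last_x, last_ys)
print(len(paths))                   # 18
print(last_x, last_ys, last_term)   # 2 (1, 1, 1, 1) 4/3
```

The 18 leaves are one path for \(X = 0\), sixteen for \(X = 1\), and one for \(X = 2\).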