Point estimates
Outline
Topics
- Common point estimates:
  - Posterior mean.
  - Posterior mode.
Rationale
It is often necessary to summarize the posterior distribution with a single “best guess”, even though, as we will see, this hides important information, namely our uncertainty about that guess.
Definitions
- Let \(\pi(x) = \mathbb{P}(X = x | Y = y)\) denote a posterior PMF.
- Point estimate: Instead of plotting the full information in \(\pi\), we can report a “location” summary such as the mean of the posterior \(\pi\).
Posterior mean
Recall the mean is computed from a PMF via \[\sum_x x\; \pi(x),\] where the sum is over \(\{x : \pi(x) > 0 \}\).
Notation: the posterior mean is denoted \(\mathbb{E}[X | Y = y] = \sum_x x\, \pi(x)\).
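As a quick illustration, here is a minimal sketch of the posterior mean as a dot product between the support points and the PMF. Storing the PMF as NumPy arrays is my own choice, not part of the notes; the values match the posterior computed in Example 1 below.

```python
import numpy as np

# Support {x : pi(x) > 0} and the corresponding posterior probabilities.
xs = np.array([0, 1, 2])
pi = np.array([0.0, 0.2, 0.8])

# Posterior mean: E[X | Y = y] = sum_x x * pi(x).
posterior_mean = np.sum(xs * pi)
print(posterior_mean)  # 1.8
```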
Posterior mode
The mode is the location of the “tallest stick” in the PMF.
Notation: \(\operatorname{arg\,max}_x \pi(x),\) i.e. the point \(x\) that achieves the maximum of \(\pi\).
In the Bayesian context, the mode of a posterior PMF is also known as the Maximum A Posteriori (MAP) estimator.
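Continuing the same sketch (again with the PMF stored as hypothetical arrays `xs` and `pi`), the MAP estimate is an argmax over the support:

```python
import numpy as np

xs = np.array([0, 1, 2])
pi = np.array([0.0, 0.2, 0.8])

# MAP estimate: the support point carrying the "tallest stick" of the PMF.
map_estimate = xs[np.argmax(pi)]
print(map_estimate)  # 2
```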
Examples
Example 1
In the following, we will compute point estimates based on \(\pi(x) = \mathbb{P}(X = x | Y = (1, 1))\) in the bag of coins running example.
- Imagine a bag with 3 coins, each with a different probability parameter \(p\)
- Coin \(i\in \{0, 1, 2\}\) has bias \(i/2\), in other words:
  - First coin: bias is \(0/2 = 0\) (i.e. both sides are “heads”, \(p = 0\))
  - Second coin: bias is \(1/2 = 0.5\) (i.e. standard coin, \(p = 1/2\))
  - Third coin: bias is \(2/2 = 1\) (i.e. both sides are “tails”, \(p = 1\))
- Consider the following two-step sampling process (a simulation sketch follows after this list):
  - Step 1: pick one of the three coins, but do not look at it!
  - Step 2: flip the coin twice
- Mathematically, this probability model can be written as follows: \[ \begin{align*} X &\sim {\mathrm{Unif}}\{0, 1, 2\} \\ Y_i | X &\sim {\mathrm{Bern}}(X/2), \quad i = 1, 2 \end{align*} \tag{1}\]
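To make the generative model concrete, here is a minimal forward-simulation sketch of Equation (1). The function name `forward_sample` and the use of NumPy are my own illustration choices, not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def forward_sample(n_flips=2):
    """One draw from the two-step process in Equation (1)."""
    x = rng.integers(0, 3)                    # Step 1: X ~ Unif{0, 1, 2}
    y = rng.binomial(1, x / 2, size=n_flips)  # Step 2: Y_i | X ~ Bern(X/2)
    return x, y

print(forward_sample())  # e.g. (2, array([1, 1]))
```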
Question: compute the posterior mean of \(\pi\).
- 0.5
- 1.8
- 2.25
- 3.5
- None of the above
First, compute the unnormalized posterior \(\gamma \propto \pi\): \[\gamma = (\gamma(0), \gamma(1), \gamma(2)) = (1/3)\, (0^2, (1/2)^2, 1^2),\] then normalize with \(Z = \gamma(0) + \gamma(1) + \gamma(2) = 5/12\): \[\pi = \gamma / Z = (0, 1/5, 4/5).\] Finally, compute the conditional expectation: \[\mathbb{E}[X | Y = (1, 1)] = \sum_x x \; \pi(x) = (0, 1, 2) \cdot (0, 1/5, 4/5) = 9/5 = 1.8.\]
Common mistake: forgetting the normalization step: \[\sum_x x \; \pi(x) \neq \sum_x x \; \gamma(x).\]
- This “common mistake” highlights that we really need \(Z\) to compute posterior expectations using the exact, exhaustive approach (i.e. the method we are using here).
- When we talk more about Monte Carlo methods, we will see that they allow us to approximate expectations without having to compute \(Z\)! (A code sketch of the exact computation follows below.)
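The whole exact, exhaustive computation, including the normalization step that the common mistake skips, fits in a few lines. This is a sketch using NumPy (my choice of tooling, not the notes'):

```python
import numpy as np

xs = np.array([0, 1, 2])

# Unnormalized posterior for Y = (1, 1): prior 1/3 times likelihood (x/2)^2.
gamma = (1 / 3) * (xs / 2) ** 2  # array([0., 1/12, 1/3])

Z = gamma.sum()                  # normalization constant: 5/12
pi = gamma / Z                   # array([0., 0.2, 0.8])

print(np.sum(xs * pi))     # 1.8  -- the correct posterior mean
print(np.sum(xs * gamma))  # 0.75 -- the "common mistake": skipping normalization
```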
Question: compute the posterior mode of \(\pi\).
- 0
- 1
- 2
- 4/5
- None of the above
Since the highest value of \(\pi\) is achieved at \(x = 2\), the answer is \(2\). (Note that \(4/5\) is the maximum value of \(\pi\), not the location where it is achieved.)
Example 2
You will practice computing the posterior mean/mode in question 2 of the exercises.