Point estimates
Outline
Topics
- Common point estimates:
  - Posterior mean.
  - Posterior mode.
Rationale
It is often necessary to summarize the posterior distribution with a single “best guess”, even though, as we will see, this hides important information, namely our uncertainty about that guess.
Definitions
- Let \(\pi(x) = \mathbb{P}(X = x | Y = y)\) denote a posterior PMF.
- Point estimate: Instead of plotting the full information in \(\pi\), we can report a “location” summary such as the mean of the posterior \(\pi\).
Posterior mean
Recall the mean is computed from a PMF via \[\sum_x x\; \pi(x),\] where the sum is over \(\{x : \pi(x) > 0 \}\).
Notation: the posterior mean is denoted \(\mathbb{E}[X | Y = y] = \sum_x x\, \pi(x)\).
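As a quick illustration, here is a minimal sketch of the posterior mean as a dot product between the support points and the PMF. Storing the PMF as NumPy arrays is my own choice, not part of the notes; the values match the posterior computed in Example 1 below.

```python
import numpy as np

# Support {x : pi(x) > 0} and the corresponding posterior probabilities.
xs = np.array([0, 1, 2])
pi = np.array([0.0, 0.2, 0.8])

# Posterior mean: E[X | Y = y] = sum_x x * pi(x).
posterior_mean = np.sum(xs * pi)
print(posterior_mean)  # 1.8
```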
Posterior mode
The mode is the location of the “tallest stick” in the PMF.
Notation: \(\operatorname{arg\,max}_x \pi(x),\) i.e. the point \(x\) that achieves the maximum of \(\pi\).
In the Bayesian context, the mode of a posterior PMF is also known as the Maximum A Posteriori (MAP) estimator.
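Continuing the same sketch (again with the PMF stored as hypothetical arrays `xs` and `pi`), the MAP estimate is an argmax over the support:

```python
import numpy as np

xs = np.array([0, 1, 2])
pi = np.array([0.0, 0.2, 0.8])

# MAP estimate: the support point carrying the "tallest stick" of the PMF.
map_estimate = xs[np.argmax(pi)]
print(map_estimate)  # 2
```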
Examples
Example 1
In the following, we will compute point estimates based on \(\pi(x) = \mathbb{P}(X = x | Y = (1, 1))\) in the bag of coins running example.
- Imagine a bag with 3 coins, each with a different probability parameter \(p\)
- Coin \(i\in \{0, 1, 2\}\) has bias \(i/2\), in other words:
  - First coin: bias is \(0/2 = 0\) (i.e. both sides are “heads”, \(p = 0\))
  - Second coin: bias is \(1/2 = 0.5\) (i.e. standard coin, \(p = 1/2\))
  - Third coin: bias is \(2/2 = 1\) (i.e. both sides are “tails”, \(p = 1\))
- Consider the following two-step sampling process (a simulation sketch follows after this list):
  - Step 1: pick one of the three coins, but do not look at it!
  - Step 2: flip the coin twice
- Mathematically, this probability model can be written as follows: \[ \begin{align*} X &\sim {\mathrm{Unif}}\{0, 1, 2\} \\ Y_i | X &\sim {\mathrm{Bern}}(X/2), \quad i = 1, 2 \end{align*} \tag{1}\]
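To make the generative model concrete, here is a minimal forward-simulation sketch of Equation (1). The function name `forward_sample` and the use of NumPy are my own illustration choices, not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def forward_sample(n_flips=2):
    """One draw from the two-step process in Equation (1)."""
    x = rng.integers(0, 3)                    # Step 1: X ~ Unif{0, 1, 2}
    y = rng.binomial(1, x / 2, size=n_flips)  # Step 2: Y_i | X ~ Bern(X/2)
    return x, y

print(forward_sample())  # e.g. (2, array([1, 1]))
```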
Question: compute the posterior mean of \(\pi\).
- 0.5
- 1.8
- 2.25
- 3.5
- None of the above
First, compute the unnormalized posterior \(\gamma \propto \pi\): \[\gamma = (\gamma(0), \gamma(1), \gamma(2)) = (1/3)\, (0^2, (1/2)^2, 1^2),\] then normalize with \(Z = \gamma(0) + \gamma(1) + \gamma(2) = 5/12\): \[\pi = \gamma / Z = (0, 1/5, 4/5).\] Finally, compute the conditional expectation: \[\mathbb{E}[X | Y = (1, 1)] = \sum_x x \; \pi(x) = (0, 1, 2) \cdot (0, 1/5, 4/5) = 9/5 = 1.8.\]
Common mistake: forgetting the normalization step: \[\sum_x x \; \pi(x) \neq \sum_x x \; \gamma(x).\]
- This “common mistake” highlights that we really need \(Z\) to compute posterior expectations using the exact, exhaustive approach (i.e. the method we are using here).
- When we talk more about Monte Carlo methods, we will see that they allow us to approximate expectations without having to compute \(Z\)! (A code sketch of the exact computation follows below.)
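The whole exact, exhaustive computation, including the normalization step that the common mistake skips, fits in a few lines. This is a sketch using NumPy (my choice of tooling, not the notes'):

```python
import numpy as np

xs = np.array([0, 1, 2])

# Unnormalized posterior for Y = (1, 1): prior 1/3 times likelihood (x/2)^2.
gamma = (1 / 3) * (xs / 2) ** 2  # array([0., 1/12, 1/3])

Z = gamma.sum()                  # normalization constant: 5/12
pi = gamma / Z                   # array([0., 0.2, 0.8])

print(np.sum(xs * pi))     # 1.8  -- the correct posterior mean
print(np.sum(xs * gamma))  # 0.75 -- the "common mistake": skipping normalization
```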
Question: compute the posterior mode of \(\pi\).
- 0
- 1
- 2
- 4/5
- None of the above
Since the highest value of \(\pi\) is achieved at \(x = 2\), the answer is \(2\). (Note that \(4/5\) is the maximum value of \(\pi\), not the location where it is achieved.)
Example 2
You will practice computing the posterior mean/mode in question 2 of the exercises.