Point estimates

Outline

Topics

  • Common point estimates:
    • Posterior mean.
    • Posterior mode.

Rationale

It is often necessary to summarize the posterior distribution with a single “best guess”, even though, as we will see, this hides important information, namely our uncertainty about that guess.

Definitions

  • Let \(\pi(x) = \mathbb{P}(X = x | Y = y)\) denote a posterior PMF.
  • Point estimate: Instead of plotting the full information in \(\pi\), we can report a “location” summary such as the mean of the posterior \(\pi\).

Posterior mean

Recall the mean is computed from a PMF via \[\sum_x x\; \pi(x),\] where the sum is over \(\{x : \pi(x) > 0 \}\).

Notation: the posterior mean is denoted \(\mathbb{E}[X | Y = y] = \sum_x x\; \pi(x)\).
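
To make this concrete, here is a minimal Python sketch (the dictionary representation of \(\pi\) is our own choice, not something from the notes) that computes the mean of a PMF stored as point–probability pairs:

```python
# A minimal sketch: store the posterior PMF pi as a dict mapping
# each point x with pi(x) > 0 to its probability pi(x).
pi = {0: 0.0, 1: 1/5, 2: 4/5}  # example values taken from Example 1 below

# Posterior mean: sum of x * pi(x) over the support.
posterior_mean = sum(x * p for x, p in pi.items())
print(posterior_mean)  # 1.8
```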

Posterior mode

The mode is the location of the “tallest stick” in the PMF.

Notation: \(\operatorname{arg\,max}_x \pi(x),\) i.e. the point that achieves the maximum of \(\pi\).

In the Bayesian context, the mode of a posterior PMF is also known as the Maximum A Posteriori (MAP) estimator.
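
Reusing the dictionary representation from the sketch above, the mode (MAP estimate) is a one-liner in Python:

```python
# The mode (MAP estimate) is the point x at which pi(x) is largest.
pi = {0: 0.0, 1: 1/5, 2: 4/5}
map_estimate = max(pi, key=pi.get)  # arg max_x pi(x) over the support
print(map_estimate)  # 2
```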

Examples

Example 1

In the following, we will compute point estimates based on \(\pi(x) = \mathbb{P}(X = x | Y = (1, 1))\) in the bag of coins running example, where \(Y = (Y_1, Y_2)\) records the first two flips.

  • Imagine a bag with 3 coins, each with a different probability parameter \(p\).
  • Coin \(i\in \{0, 1, 2\}\) has bias \(i/2\); in other words:
    • First coin: bias is \(0/2 = 0\) (i.e. both sides are “heads”, \(p = 0\))
    • Second coin: bias is \(1/2 = 0.5\) (i.e. standard coin, \(p = 1/2\))
    • Third coin: bias is \(2/2 = 1\) (i.e. both sides are “tails”, \(p = 1\))

  • Consider the following two-step sampling process:
    • Step 1: pick one of the three coins, but do not look at it!
    • Step 2: flip the coin 4 times.
  • Mathematically, this probability model can be written as follows (a small simulation sketch follows this list): \[ \begin{align*} X &\sim {\mathrm{Unif}}\{0, 1, 2\} \\ Y_i | X &\sim {\mathrm{Bern}}(X/2), \end{align*} \tag{1}\] where \(Y_i = 1\) records that flip \(i\) comes up “tails”, so a coin with bias \(p = X/2\) produces \(Y_i = 1\) with probability \(p\).
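
Here is a small Python sketch of the two-step process in Equation (1); the function name `sample_once` is ours, introduced only for illustration:

```python
import random

def sample_once(n_flips=4):
    """One draw from the two-step process in Equation (1)."""
    x = random.choice([0, 1, 2])  # Step 1: X ~ Unif{0, 1, 2}
    # Step 2: each flip is Bern(X/2), recorded as 1 ("tails") or 0 ("heads").
    y = [int(random.random() < x / 2) for _ in range(n_flips)]
    return x, y

print(sample_once())  # e.g. (1, [0, 1, 1, 0])
```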

Question: compute the posterior mean of \(\pi\).

  1. 0.5
  2. 1.8
  3. 2.25
  4. 3.5
  5. None of the above

First, compute the unnormalized posterior \(\gamma \propto \pi\): \[\gamma = (\gamma(0), \gamma(1), \gamma(2)) = (1/3) (0^2, (1/2)^2, 1^2) = (0, 1/12, 1/3),\] then normalize by \(Z = \gamma(0) + \gamma(1) + \gamma(2) = 5/12\): \[\pi = \gamma / Z = (0, 1/5, 4/5).\] Finally, compute the conditional expectation: \[\mathbb{E}[X | Y = (1, 1)] = \sum_x x \; \pi(x) = (0, 1, 2) \cdot (0, 1/5, 4/5) = 9/5 = 1.8.\]
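
The same calculation as a short Python check (the variable names are ours):

```python
# Unnormalized posterior, normalization constant, posterior PMF,
# and the two point estimates from this example.
support = [0, 1, 2]
gamma = [(1/3) * (x / 2)**2 for x in support]  # prior times likelihood
Z = sum(gamma)                                  # 5/12
pi = [g / Z for g in gamma]                     # [0.0, 0.2, 0.8]
posterior_mean = sum(x * p for x, p in zip(support, pi))  # 1.8
posterior_mode = max(support, key=lambda x: pi[x])        # 2
print(Z, pi, posterior_mean, posterior_mode)
```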

Common mistake: forgetting the normalization step: \[\sum_x x \; \pi(x) \neq \sum_x x \; \gamma(x).\]

  • This “common mistake” highlights that we really need \(Z\) to compute posterior expectations using the exact, exhaustive approach (i.e. the method we are using here).
  • When we talk more about Monte Carlo methods, we will see that these methods allow us to approximate expectations without having to compute \(Z\) (see the preview sketch below)!
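
As a preview (our illustration, not yet the course’s Monte Carlo methods), one can already approximate \(\mathbb{E}[X | Y = (1, 1)]\) without ever computing \(Z\): simulate \((X, Y)\) pairs from the joint model and average \(X\) over the draws that match the observation.

```python
import random

random.seed(1)
kept = []
for _ in range(100_000):
    x = random.choice([0, 1, 2])                          # X ~ Unif{0, 1, 2}
    y = [int(random.random() < x / 2) for _ in range(2)]  # first two flips
    if y == [1, 1]:  # keep draws matching the observation Y = (1, 1)
        kept.append(x)
print(sum(kept) / len(kept))  # approximately 1.8; no Z required
```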

Question: compute the posterior mode of \(\pi\).

  1. 0
  2. 1
  3. 2
  4. 4/5
  5. None of the above

Since the highest value of \(\pi\) is achieved at \(x = 2\), the answer is \(2\).

Example 2

You will practice computing the posterior mean/mode in question 2 of the exercises.