source("../exercises/ex03_scaffold.R")
source("../../solutions/sol03_posterior.R")
<- function() {
doomsday_model = simulate(Unif(0, 5))
x observe(0.06, Unif(0, x))
return(x)
}
posterior(doomsday_model, 1000)
[1] 1.032426
Since the parameters of distributions families are continuous, it natural to use continuous random variables for them in Bayesian statistics. For example it lets us get rid of the artificial discretization \(K\) we used in last week’s exercise.
Today we will approach the calculations with both manual mathematic derivation and PPLs. Later we will increasingly rely only on PPLs. But it is still important to understand both methods.
Definition: \(X\) has density \(f(x)\) if \[\mathbb{P}(X \in A) = \int_A f(x) \mathrm{d}x.\]
LOTUS: works the same as with discrete models: \[\mathbb{E}[g(X)] = \int f(x) g(x) \mathrm{d}x.\]
Definition: \((X, Y)\) has joint density \(f(x, y)\) if \[\mathbb{P}((X, Y) \in A) = \int \int_A f(x, y) \mathrm{d}x.\]
Note that in much of what follows, everything works the same as with with discrete models except we replace sums by integrals and PMFs \(p\) by densities \(f\).
LOTUS: \[\mathbb{E}[g(X, Y)] = \int \int f(x, y) g(x, y) \mathrm{d}x \mathrm{d}y.\]
Marginalization: going from a joint density on \((X, Y)\) to the density of \(X\) only (the latter is called the marginal): \[f_X(x) = \int f(x, y) \mathrm{d}y.\]
Chain rule: \[f(x, y) = f_X(x) f_{Y|X}(y|x).\]
Bayes theorem: \[f_{X|Y}(x|y) = \frac{f(x, y)}{f_Y(y)}.\]
Consider the following joint density described using chain rule: \[\begin{align*} X &\sim {\mathrm{Unif}}(0, 5) \\ Y | X &\sim {\mathrm{Unif}}(0, X). \end{align*}\]
Goal: from the \(Y\), trying to guess the full length of the tape.
Goal: Can we guess (probabilistically) how many more human there will ever be in total?
This is known as the Doomsday argument
Joint density: which of these is the joint density?
We have \[f_X(x) = \frac{\mathbb{1}[0 < x < 5]}{5}\] and \[f_{Y|X}(y|x) = \frac{\mathbb{1}[0 < y < x]}{x},\] therefore \[f(x, y) = \frac{\mathbb{1}[0 < x < 5]}{5} \frac{\mathbb{1}[0 < y < x]}{x}.\]
Marginal density: compute the marginal \(f_Y(y)\)
\[\begin{align*} f_Y(y) &= \frac{1}{5} \int \frac{ \mathbb{1}[0 < x < 5] \mathbb{1}[0 < y < x]}{x} \mathrm{d}x \\ &= \frac{1}{5} \int_y^5 \frac{\mathrm{d}x}{x} \\ &= \frac{1}{5} (\log 5 - \log y). \end{align*}\]
Posterior density: from our marginal and Bayes’ theorem, we obtain \[f_{X|Y}(x|y) = \frac{\mathbb{1}[0<x<5] \mathbb{1}[0 < y < x]}{x (\log 5 - \log y)}.\]
Posterior expectation: compute \(\mathbb{E}[X|Y = 0.06]\)
\[\begin{align*} f_{X|Y}(x|y) &= \int_y^5 x f_{X|Y}(x|y) \mathrm{d}x \\ &= \frac{1}{\log 5 - \log y} \int_y^5 \mathrm{d}x \\ &= \frac{1}{\log 5 - \log y} (5 - y) \approx 1.117. \\ \end{align*}\]
Compare this to the simPPLe approximation
source("../exercises/ex03_scaffold.R")
source("../../solutions/sol03_posterior.R")
<- function() {
doomsday_model = simulate(Unif(0, 5))
x observe(0.06, Unif(0, x))
return(x)
}
posterior(doomsday_model, 1000)
[1] 1.032426