MCMC debugging

Outline

Topics

- Exact invariance test

Rationale

It is easy for bugs to sneak into MCMC code. In the workflow week, we briefly outlined how simulated data can be used to detect software defects. However, the test discussed there still left "MCMC too slow" as one possible issue confounded with a software defect.

Here we describe a new test (the exact invariance test) that excludes "MCMC too slow" from the possible causes of a warning. When the exact invariance test raises a warning, the only possible causes are "software defect" and "bad luck."

Example

Here is an implementation of the joint distribution of a beta-binomial model, with a bug planted:

set.seed(1)

# prior: Beta(alpha, beta)
alpha = 1
beta = 2
n_trials = 3

buggy_joint = function(x, y) {
  if (x < 0 || x > 1) return(0.0)
  dbeta(x, alpha, beta) * dbinom(y, size = n_trials, prob = x, log = TRUE)
}

Can you spot the bug?

It mixes up log-scale and non-log-scale computations, a common mistake. Concretely, the log = TRUE argument should be removed.
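One quick way to see the symptom (a sketch, reusing alpha = 1, beta = 2, n_trials = 3 from the code above): with log = TRUE, dbinom returns a log-probability, so the "joint" can come out negative, which is impossible for a density.

```r
# Same constants as in the lecture code above.
alpha = 1
beta = 2
n_trials = 3

# Buggy version: multiplies a density by a LOG-probability.
buggy = dbeta(0.5, alpha, beta) * dbinom(1, size = n_trials, prob = 0.5, log = TRUE)

# Fixed version: both factors are on the natural (non-log) scale.
fixed = dbeta(0.5, alpha, beta) * dbinom(1, size = n_trials, prob = 0.5)

buggy  # negative, which no joint density can be
fixed  # positive
```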
Can we automatically detect there is a bug?
Exact invariance test
Building block: forward simulation-initialized MCMC
The first ingredient we need is a forward simulator:
forward = function() {
  x = rbeta(1, alpha, beta)
  y = rbinom(1, n_trials, x)
  return(list(
    x = x,
    y = y
  ))
}
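As a quick sanity check (a sketch, not part of the test itself; the constants and simulator are repeated so the snippet stands alone), the x-marginal of forward() should be Beta(alpha, beta), whose mean is alpha / (alpha + beta) = 1/3:

```r
# Repeating the lecture's definitions so this snippet runs standalone.
alpha = 1
beta = 2
n_trials = 3

forward = function() {
  x = rbeta(1, alpha, beta)
  y = rbinom(1, n_trials, x)
  return(list(x = x, y = y))
}

set.seed(2)
x_draws = replicate(10000, forward()$x)
mean(x_draws)  # should be close to alpha / (alpha + beta) = 1/3
```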
Next, let’s do the following:
- Call our forward simulator. \[(x, y) \sim \text{forward()}. \]
- Run an MCMC algorithm initialized at the \((x, y)\) we just forward simulated.
stationary_mcmc = function(joint, n_iterations) {
  initialization = forward()                      # <1>
  y = initialization$y
  if (n_iterations == 0) {
    return(initialization$x)
  } else {
    current_x = initialization$x                  # <2>
    for (i in 1:n_iterations) {                   # <3>
      proposed_x = current_x + rnorm(1)
      ratio = joint(proposed_x, y) / joint(current_x, y)
      if (runif(1) < ratio) {
        current_x = proposed_x
      }
    }
    return(current_x)
  }
}
1. Call our forward simulator.
2. Initialize the MCMC chain.
3. Perform n_iterations rounds of MCMC conditioning on the y we simulated.
- Typically, we do MCMC on \(x\) only…
- …but here, view the MCMC as a chain on pairs \((x, y)\).
- Typically, MCMC targets the conditional, \(\pi(x) = p(x | y)\)…
- …but here, view it as targeting the joint, \(\pi(x, y) = p(x, y)\).
- What is the initial distribution, \(\mu_1\)?
- By construction, we initialize with forward(), so \[\mu_1(x, y) = \pi(x, y).\]
Question: if the code is correct, what is the distribution \(\mu_2\) of the chain after one iteration, i.e., the distribution of \((X^{(2)}, Y^{(2)})\)?
- It is intractable.
- It is approximately \(\pi(x, y)\).
- It is exactly \(\pi(x, y)\).
- It is approximately normal.
- None of the above.
It is exactly \(\pi(x, y)\).
First, from our results on marginals of Markov chains, we have \[\mu_2(x', y') = \sum_{x, y} \mu_1(x, y) K(x', y' | x, y).\]
But since \(\mu_1(x, y) = \pi(x, y)\), we have, under the assumption that the code is \(\pi\)-invariant, that global balance holds: \[\mu_2(x', y') = \sum_{x, y} \pi(x, y) K(x', y' | x, y) = \pi(x', y').\]
Using the same argument recursively, we get that \(\mu_m = \pi\) for all \(m\).
Here we will use specifically that this is true after \(m = 200\) MCMC iterations (if the code is correct!): \[\mu_{200} = \pi.\]
Repeat 2000 times
- We now repeat what we just did 1000 times.
- We also do forward simulation alone, 1000 times (by setting n_iterations to zero).
exact_invariance = function(joint) {
  forward_only = replicate(1000, stationary_mcmc(joint, 0))
  with_mcmc = replicate(1000, stationary_mcmc(joint, 200))
  ks.test(forward_only, with_mcmc)
}
Frequentist test of distributional equality
- We now have 2 sets of samples, each i.i.d.: forward_only and with_mcmc.
- Key property: if they have different distributions, we can conclude that the code is not \(\pi\)-invariant, i.e., buggy.
- How to check this?
- Let’s use frequentist tools!
- One of the rare situations where a “point-null hypothesis” makes perfect sense.
- i.e., exact equality of the distributions is precisely what we would expect if the code is correct.
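As a toy illustration of how such a two-sample test behaves (a sketch, not part of the lecture code): samples from the same distribution should give a large p-value, while samples from different distributions should give a tiny one.

```r
set.seed(3)

# Two samples from the SAME distribution: no evidence of a difference.
p_same = ks.test(rnorm(1000), rnorm(1000))$p.value

# Two samples from DIFFERENT distributions: strong evidence of a difference.
p_diff = ks.test(rnorm(1000), rnorm(1000, mean = 1))$p.value

p_same  # typically large
p_diff  # tiny
```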
test_result = exact_invariance(buggy_joint)
test_result
Asymptotic two-sample Kolmogorov-Smirnov test
data: forward_only and with_mcmc
D = 0.099, p-value = 0.0001108
alternative hypothesis: two-sided
- You do not need to know the details of this test…
- …rather, focus on its interpretation:
  - we observed a certain "discrepancy" between the empirical CDFs of forward_only and with_mcmc, here 0.099
  - if the code were correct, we would see a discrepancy larger than or equal to that only with probability 0.0001107924
  - that last probability is the infamous "p-value," which is often over-used, but is a good tool in the present situation.
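For reference, the "discrepancy" reported as D is the Kolmogorov-Smirnov statistic: the largest vertical gap between the two empirical CDFs,

\[D = \sup_t \left| \hat{F}_{\text{forward\_only}}(t) - \hat{F}_{\text{with\_mcmc}}(t) \right|.\]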
Success! The p-value is tiny, so we would indeed have caught that bug.
Checking that correct code passes the test
Let’s look at the fixed code:
fixed_joint = function(x, y) {
  if (x < 0 || x > 1) return(0.0)
  dbeta(x, alpha, beta) * dbinom(y, size = n_trials, prob = x)
}
exact_invariance(fixed_joint)
Asymptotic two-sample Kolmogorov-Smirnov test
data: forward_only and with_mcmc
D = 0.027, p-value = 0.8593
alternative hypothesis: two-sided
Again, success! The p-value is not tiny, as one would hope.