MH as a Markov chain

Outline

Topics

  • Homogeneous Markov chains
  • Transition kernel
  • Initial distribution
  • MH as a Markov chain

Rationale

Why is the MH ratio defined as \(\gamma(x')/\gamma(x)\)?

If we modified the MH ratio, would MCMC still “work”?

Recall: by “MCMC still work” we mean “MCMC still be consistent” i.e. enjoy a law of large numbers.

Markov chain definitions

To answer the question above:

  • we need to study in more details law of large numbers for Markov chains.
  • We will need a some concepts related to Markov chains

Recall: the random variables \(X^{(1)}, X^{(2)}, \dots\) are called a Markov chain if they admit the following “chain” graphical model.

Note: today we will assume the state space is discrete, we’ll relax this soon.

Definition: let \(K_m(x' | x) = \mathbb{P}(X^{(m+1)} = x' | X^{(m)}= x)\). We call \(K_m\) the transition kernel (for the \(m\)-th step).

Definition: if all the transition kernels are equal, \(K_m = K\) for some \(K\), we say the Markov chain is homogeneous.

Definition: the marginal distribution of the first state, \(X^{(1)}\) is called the initial distribution.

The MH algorithm as a Markov chain

Recall MH involves:

  • a proposal \(q(x'|x)\)
  • a MH ratio \(r(x, x') = \gamma(x') / \gamma(x)\), and
  • an acceptance probability \(\alpha(x, x') = \min(1, r(x, x'))\).

Question: write the transition kernel \(K(x' | x)\), assuming \(x \neq x'\).