Exercise 6: Hierarchical models

Q.1: efficacy of vaccines

In this exercise we will model the effectiveness of COVID vaccines using clinical trials data. Each trial consists of two arms: vaccinated (i.e., treated) and control (i.e., not treated). For a typical trial we know

  1. The total number of patients in each arm: \(t_\text{v}\) (vaccinated), \(t_\text{c}\) (control).
  2. The number of patients that got infected with the SARS-CoV-2 virus in each arm: \(n_\text{v}\), \(n_\text{c}\).

We model \(n_\text{v}\) and \(n_\text{c}\) as Binomial random variables. The unknown parameter for these distributions will depend on two numbers in \([0,1]\)

  1. Incidence \(p\): the probability that a patient in the trial will become infected without being treated with the vaccine.
  2. Effectiveness \(e\): the decrease in incidence that the vaccine provides.

We put a beta prior on both parameters. Since the parameters of the \({\mathrm{Beta}}(\alpha,\beta)\) distribution are not very interpretable, we will work with a re-parametrized version \({\mathrm{BetaMP}}(\mu,\lambda)\) where \(\mu\in[0,1]\) is the mean and \(\lambda>0\) is a precision parameter. The bijection is given by \[ \mu = \frac{\alpha}{\alpha+\beta} \quad \lambda=\alpha+\beta \qquad\iff\qquad \alpha = \mu \lambda \quad \beta = (1-\mu)\lambda \]

Finally, we put priors on the parameters of these beta distributions to complete the model as follows \[ \begin{aligned} \mu_\text{e} &\sim {\mathrm{Unif}}(0,1) && (\text{diffuse prior})\\ \lambda_\text{e} &\sim {\mathrm{Exp}}(0.01) && (\text{diffuse prior})\\ \mu_\text{p} &\sim {\mathrm{BetaMP}}(0.1, 15) && (\text{incidence is usually low})\\ \lambda_\text{p} &\sim {\mathrm{Exp}}(0.01) && (\text{diffuse prior})\\ e|\mu_\text{e},\lambda_\text{e} &\sim {\mathrm{BetaMP}}(\mu_\text{e}, \lambda_\text{e}) \\ p|\mu_\text{p},\lambda_\text{p} &\sim {\mathrm{BetaMP}}(\mu_\text{p}, \lambda_\text{p}) \\ n_\text{c}|p &\sim {\mathrm{Binom}}(t_\text{c}, p) \\ n_\text{v}|e,p &\sim {\mathrm{Binom}}(t_\text{v}, p(1-e)) \end{aligned} \tag{1}\]

Now, consider the following data—available here—arising from clinical trials of two popular COVID vaccines

  1. Expand the model in Equation 1 into a hierarchical model that covers both vaccines. The parameters \((\mu_\text{e},\lambda_\text{e},\mu_\text{p},\lambda_\text{p})\) must be shared across vaccines. In contrast, each vaccine must have its own \((e,p)\) pair.
  2. Implement the model from part 1 in simPPLe. Your PPL function should return the indicator that Moderna is more effective than Pfizer. Use this code to define \({\mathrm{BetaMP}}(\mu, \lambda)\)
BetaMP = function(mean, precision){
  Beta(mean*precision, (1-mean)*precision)
}
  1. Download the data and load it using the following (no points for this part)
vaccines = read.csv("vaccines.csv")
vaccines$groupSizes = as.double(vaccines$groupSizes) # needed due to bug in Binom code
  1. Run your model in simPPLe using posterior with 10,000 iterations, and report the estimated posterior probability that Moderna is more effective than Pfizer. Is the SNIS approximation reliable?