Exercise 6: Hierarchical models
Q.1: efficacy of vaccines
In this exercise we will model the effectiveness of COVID vaccines using clinical trials data. Each trial consists of two arms: vaccinated (i.e., treated) and control (i.e., not treated). For a typical trial we know
- The total number of patients in each arm: \(t_\text{v}\) (vaccinated), \(t_\text{c}\) (control).
- The number of patients that got infected with the SARS-CoV-2 virus in each arm: \(n_\text{v}\), \(n_\text{c}\).
We model \(n_\text{v}\) and \(n_\text{c}\) as Binomial random variables. The unknown parameter for these distributions will depend on two numbers in \([0,1]\)
- Incidence \(p\): the probability that a patient in the trial will become infected without being treated with the vaccine.
- Effectiveness \(e\): the decrease in incidence that the vaccine provides.
We put a beta prior on both parameters. Since the parameters of the \({\mathrm{Beta}}(\alpha,\beta)\) distribution are not very interpretable, we will work with a re-parametrized version \({\mathrm{BetaMP}}(\mu,\lambda)\) where \(\mu\in[0,1]\) is the mean and \(\lambda>0\) is a precision parameter. The bijection is given by \[ \mu = \frac{\alpha}{\alpha+\beta} \quad \lambda=\alpha+\beta \qquad\iff\qquad \alpha = \mu \lambda \quad \beta = (1-\mu)\lambda \]
Finally, we put priors on the parameters of these beta distributions to complete the model as follows \[ \begin{aligned} \mu_\text{e} &\sim {\mathrm{Unif}}(0,1) && (\text{diffuse prior})\\ \lambda_\text{e} &\sim {\mathrm{Exp}}(0.01) && (\text{diffuse prior})\\ \mu_\text{p} &\sim {\mathrm{BetaMP}}(0.1, 15) && (\text{incidence is usually low})\\ \lambda_\text{p} &\sim {\mathrm{Exp}}(0.01) && (\text{diffuse prior})\\ e|\mu_\text{e},\lambda_\text{e} &\sim {\mathrm{BetaMP}}(\mu_\text{e}, \lambda_\text{e}) \\ p|\mu_\text{p},\lambda_\text{p} &\sim {\mathrm{BetaMP}}(\mu_\text{p}, \lambda_\text{p}) \\ n_\text{c}|p &\sim {\mathrm{Binom}}(t_\text{c}, p) \\ n_\text{v}|e,p &\sim {\mathrm{Binom}}(t_\text{v}, p(1-e)) \end{aligned} \tag{1}\]
Now, consider the following data—available here—arising from clinical trials of two popular COVID vaccines
- Expand the model in Equation 1 into a hierarchical model that covers both vaccines. The parameters \((\mu_\text{e},\lambda_\text{e},\mu_\text{p},\lambda_\text{p})\) must be shared across vaccines. In contrast, each vaccine must have its own \((e,p)\) pair.
- Implement the model from part 1 in simPPLe. Your PPL function should return the indicator that Moderna is more effective than Pfizer. Use this code to define \({\mathrm{BetaMP}}(\mu, \lambda)\)
= function(mean, precision){
BetaMP Beta(mean*precision, (1-mean)*precision)
}
- Download the data and load it using the following (no points for this part)
= read.csv("vaccines.csv")
vaccines $groupSizes = as.double(vaccines$groupSizes) # needed due to bug in Binom code vaccines
- Run your model in simPPLe using
posterior
with 10,000 iterations, and report the estimated posterior probability that Moderna is more effective than Pfizer. Is the SNIS approximation reliable?