Installing and running Stan
Outline
Topics
- What is Stan?
- Links to install.
Rationale
SNIS and simPPLe are very flexible and relatively easy to understand, but they can be very slow.
Stan is an alternative way to approximate posterior distributions, with complementary properties:
| | SNIS | Stan |
|---|---|---|
| Speed | Slow | Faster[^1] |
| Flexibility | Very flexible | Less flexible[^2] |
| Easy to understand? | Simple | More complex[^3] |
From a pedagogical point of view, it is useful to learn about SNIS first. For real-world models, however, one would typically use Stan or some other advanced inference method, because SNIS scales poorly.
What is Stan?
- Stan is the most popular PPL as of 2024.
- Review the notes on “what is a PPL.”
- Stan uses Markov chain Monte Carlo (MCMC) to approximate the posterior distribution.
- Think of MCMC as a drop-in replacement for SNIS.
- We will talk about it in more detail soon.
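To make the comparison concrete, here is a minimal base-R sketch of what SNIS computes, which is the same kind of posterior summary that Stan will approximate with MCMC. The model (a uniform prior on a success probability `p` with a binomial likelihood), the data values, and the number of samples `M` are illustrative assumptions, not taken from simPPLe itself.

```r
# Minimal SNIS sketch for a beta-binomial model (illustrative assumptions only)
set.seed(1)
n <- 3                     # number of trials (assumed data)
k <- 3                     # number of successes (assumed data)
M <- 10000                 # number of importance samples (arbitrary choice)

p <- runif(M)                        # proposal = prior, Beta(1, 1) = Uniform(0, 1)
w <- dbinom(k, size = n, prob = p)   # unnormalized weights = likelihood of the data
w_norm <- w / sum(w)                 # self-normalization: weights now sum to 1

snis_mean <- sum(w_norm * p)         # SNIS estimate of the posterior mean of p

# Conjugacy gives the exact answer here: the posterior is Beta(1 + k, 1 + n - k)
exact_mean <- (1 + k) / (2 + n)

print(c(snis = snis_mean, exact = exact_mean))
```

In a one-dimensional problem like this one, SNIS works well. The scalability issue only shows up as the number of unknowns grows.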
Installing Stan
You will need Stan installed to complete the exercises and clicker questions after the quiz. Don’t wait until the last minute: install it today!
There are two main steps to install Stan:
Let us know on Piazza if you encounter any issues!
Running Stan
We present two methods for running Stan in the next two sections: either from an R script, or from a notebook.
Template: to get started quickly, download the following templates, which you can use as a starting point for either an R script or a notebook.
From an R script
First, copy and paste the following code into a file called beta_binomial.stan:
beta_binomial.stan
data {
  int<lower=0> n;          // number of trials
  int<lower=0,upper=n> k;  // number of successes
}
parameters {
  real<lower=0,upper=1> p;
}
model {
  // prior
  p ~ beta(1,1);
  // likelihood
  k ~ binomial(n, p);
}

Second, run Stan as follows:
library(rstan)
fit = stan(
  seed = 123,
  file = "beta_binomial.stan",  # Stan program
  data = list(n=3, k=3),        # named list of data
  iter = 1000                   # iterations per chain, including warmup
)

The first question of the next exercise will be to report the posterior median, which appears in the “50%” column of the output of the following R command:
print(fit)

From a notebook
To see an example of how to integrate Stan code inside a Quarto document (R Markdown works the same way), see the source code of the next page in the notes, available on GitHub.
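Whether Stan is run from a script or from a notebook, it is worth sanity-checking the output when an exact answer is available. The beta_binomial.stan model happens to be conjugate, so its posterior is known in closed form, and the sketch below computes the exact posterior quantiles in base R. The variable names are ours, and the commented-out last line assumes that `fit` is the object returned by `stan()` for this model.

```r
# Exact posterior for the beta-binomial model, as a check on Stan's output
n <- 3                     # number of trials, as passed to Stan
k <- 3                     # number of successes, as passed to Stan
a <- 1                     # prior is Beta(a, b) = Beta(1, 1)
b <- 1

# Conjugacy: Beta(a, b) prior + binomial likelihood => Beta(a + k, b + n - k) posterior
post_a <- a + k
post_b <- b + n - k

exact_median   <- qbeta(0.5, post_a, post_b)
exact_interval <- qbeta(c(0.025, 0.975), post_a, post_b)

# Assuming `fit` exists, the matching Stan estimate can be read off with:
# summary(fit)$summary["p", "50%"]
```

The MCMC estimate should agree with `exact_median` only up to Monte Carlo error, so expect small differences in the trailing digits.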
Footnotes
[^1]: MCMC algorithms such as the one used by Stan often scale polynomially in the dimension, whereas SNIS scales exponentially.
[^2]: For example, Stan does not support latent integer-valued random variables, whereas simPPLe does.
[^3]: simPPLe is a few dozen lines of code, whereas Stan has millions of lines of code.