# Stan: hands on

## Outline

We revisit the time-varying Ariane 1 rocket failure probability model to practice the Stan syntax introduced this week.

```r
suppressPackageStartupMessages(require(rstan))
suppressPackageStartupMessages(require(ggplot2))
suppressPackageStartupMessages(require(dplyr))
set.seed(1)
df = read.csv(url("https://raw.githubusercontent.com/UBC-Stat-ML/web447/main/data/launches.csv")) %>% filter(LV.Type == "Ariane 1")
success_indicators = df$Suc_bin
rmarkdown::paged_table(df)
```

```r
plot(success_indicators, xlab = "Launch index i")
```

## Model

Recall the model we discussed previously (with one difference: we now use a standard deviation of 10):
\[\begin{align*} \text{slope} &\sim \mathcal{N}(0, 10) \\ \text{intercept} &\sim \mathcal{N}(0, 10) \\ \theta(i) &= \text{logistic}(\text{slope} \cdot i + \text{intercept}) \\ y_i &\sim {\mathrm{Bern}}(\theta(i)) \end{align*}\]
…which you also implemented in simPPLe as part of exercise 4:
```r
logistic_regression = function() {
  intercept = simulate(Norm(0, 10))
  slope     = simulate(Norm(0, 10))
  for (i in seq_along(success_indicators)){
    success_probability = plogis(intercept + i*slope)
    observe(success_indicators[i], Bern(success_probability))
  }
  return(c(intercept, slope))
}
```

## Translation into Stan
Question 1: use the template below to translate the above model into Stan. Set the seed to 1, run 10000 MCMC iterations, and report the posterior mean of the slope parameter.
```stan
data { 
  int N; 
  array[N] int y; 
}
parameters { 
  real slope;
  real intercept;
}
model {
  // complete this
}
```

The value closest to the posterior mean of the slope parameter is…
- 0.6
- 0.3
- 0.1
- -0.7
- None of the above
```stan
data { 
  int N; 
  array[N] int y; 
}
parameters { 
  real slope;
  real intercept;
}
model {
  slope ~ normal(0, 10);
  intercept ~ normal(0, 10);
  for (i in 1:N) {
    y[i] ~ bernoulli(inv_logit(intercept + slope * i));
  }
}
```

```r
fit = sampling(
  logistic, 
  data = list( 
          y = success_indicators, 
          N = length(success_indicators)
        ), 
  chains = 1,
  iter = 10000       
)
```
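Note that `sampling()` expects a compiled model object. The notes compile the Stan program in a dedicated Stan chunk, so the object name `logistic` is an assumption here; a sketch of the equivalent explicit compilation with `rstan::stan_model`:

```r
library(rstan)

# compile the Stan program above into a stanmodel object (one-time C++ compilation)
# the inline model_code string mirrors the program shown above
logistic = stan_model(model_code = "
data { int N; array[N] int y; }
parameters { real slope; real intercept; }
model {
  slope ~ normal(0, 10);
  intercept ~ normal(0, 10);
  for (i in 1:N) y[i] ~ bernoulli(inv_logit(intercept + slope * i));
}
")
```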
```
SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 5e-06 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:    1 / 10000 [  0%]  (Warmup)
Chain 1: Iteration: 1000 / 10000 [ 10%]  (Warmup)
Chain 1: Iteration: 2000 / 10000 [ 20%]  (Warmup)
Chain 1: Iteration: 3000 / 10000 [ 30%]  (Warmup)
Chain 1: Iteration: 4000 / 10000 [ 40%]  (Warmup)
Chain 1: Iteration: 5000 / 10000 [ 50%]  (Warmup)
Chain 1: Iteration: 5001 / 10000 [ 50%]  (Sampling)
Chain 1: Iteration: 6000 / 10000 [ 60%]  (Sampling)
Chain 1: Iteration: 7000 / 10000 [ 70%]  (Sampling)
Chain 1: Iteration: 8000 / 10000 [ 80%]  (Sampling)
Chain 1: Iteration: 9000 / 10000 [ 90%]  (Sampling)
Chain 1: Iteration: 10000 / 10000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0.046 seconds (Warm-up)
Chain 1:                0.046 seconds (Sampling)
Chain 1:                0.092 seconds (Total)
Chain 1: 
```

```r
fit
```

```
Inference for Stan model: anon_model.
1 chains, each with iter=10000; warmup=5000; thin=1; 
post-warmup draws per chain=5000, total post-warmup draws=5000.
           mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
slope      0.58    0.01 0.44 -0.12  0.26  0.52  0.82  1.61  1009    1
intercept -0.71    0.06 1.89 -4.46 -1.90 -0.64  0.55  2.86  1056    1
lp__      -5.59    0.03 1.19 -8.85 -6.07 -5.23 -4.74 -4.41  1195    1
Samples were drawn using NUTS(diag_e) at Thu Mar  7 11:04:48 2024.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
```

Hence the correct choice was 0.6.
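The posterior mean can also be pulled out programmatically rather than read off the printout (a sketch, assuming the `fit` object from above is in scope):

```r
# summary(fit)$summary is a matrix with one row per parameter
# and columns such as "mean", "2.5%", "97.5%"
summary(fit)$summary["slope", "mean"]
```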
## Using the posterior
Question 2: compute a 95% credible interval on the slope parameter.
Use the output of `print` on the `fit` object.
- \([-4.5, 2.9]\)
- \([-0.3, 1.4]\)
- \([-0.1, 1.6]\)
- \([0.2, 0.9]\)
- None of the above
Using again the printout of the `fit` object:

```r
fit
```

```
Inference for Stan model: anon_model.
1 chains, each with iter=10000; warmup=5000; thin=1; 
post-warmup draws per chain=5000, total post-warmup draws=5000.
           mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
slope      0.58    0.01 0.44 -0.12  0.26  0.52  0.82  1.61  1009    1
intercept -0.71    0.06 1.89 -4.46 -1.90 -0.64  0.55  2.86  1056    1
lp__      -5.59    0.03 1.19 -8.85 -6.07 -5.23 -4.74 -4.41  1195    1
Samples were drawn using NUTS(diag_e) at Thu Mar  7 11:04:48 2024.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
```

We can get a quantile-based credible interval by looking at the columns 2.5% and 97.5% (i.e., chopping off 5%/2 on each side): this yields \([-0.12, 1.61]\).
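Equivalently, the interval can be computed directly from the posterior draws (a sketch, assuming the `fit` object from above is in scope):

```r
# extract() returns the post-warmup draws as a named list of arrays
slope_draws = rstan::extract(fit)$slope
# quantile-based 95% credible interval: chop off 2.5% on each side
quantile(slope_draws, probs = c(0.025, 0.975))
```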
Question 3: translate “your updated belief that the Ariane 1 rockets were improving” into a mathematical expression.
- \(\mathbb{E}[\text{slope}| Y = y] > 0\)
- \(\mathbb{E}[\text{slope}] > 0\)
- \(\mathbb{P}(\text{slope} > 0 | Y = y)\)
- \(\mathbb{P}(\text{slope} > 0)\)
- None of the above
“Ariane 1 rockets were improving” translates to the event \((\text{slope} > 0)\). Hence our “updated belief,” which is just the posterior probability, is \(\mathbb{P}(\text{slope} > 0 | Y = y)\).
Question 4: estimate the numerical value of the expression in the last question.
- 0.65
- 0.75
- 0.85
- 0.95
- None of the above
Here we want to add a new random variable to our model, defined as: \[\text{is\_slope\_positive} = \mathbb{1}[\text{slope} > 0].\] Why? Because the posterior mean of this indicator is exactly the required probability: \[\mathbb{E}[\mathbb{1}[\text{slope} > 0]|Y = y] = \mathbb{P}(\text{slope} > 0 | Y = y),\] so we will be able to read off that mean from the printout of the fit object.
Now, how do we add the is_slope_positive random variable to our Stan model?
Since the random variable is defined using "=" rather than "\(\sim\)", we could do this with a `transformed parameters` block, introduced earlier this week.
Notice, however, that is_slope_positive is not used within the model block. In that case, an alternative is a `generated quantities` block, which we demonstrate here:
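For reference, the `transformed parameters` route would look like the sketch below (placed between the `parameters` and `model` blocks); unlike generated quantities, transformed parameters are evaluated at every log-density evaluation during sampling, which is wasted work for a quantity the model block never uses:

```stan
transformed parameters {
  real is_slope_positive = slope > 0 ? 1 : 0;
}
```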
```stan
data { 
  int N; 
  array[N] int y; 
}
parameters { 
  real slope;
  real intercept;
}
model {
  slope ~ normal(0, 10);
  intercept ~ normal(0, 10);
  for (i in 1:N) {
    y[i] ~ bernoulli(inv_logit(intercept + slope * i));
  }
}
generated quantities {
  real is_slope_positive = slope > 0 ? 1 : 0;
}
```

```r
fit = sampling(
  logistic_with_generated, 
  data = list( 
          y = success_indicators, 
          N = length(success_indicators)
        ), 
  chains = 1,
  iter = 10000       
)
```
```
SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 5e-06 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:    1 / 10000 [  0%]  (Warmup)
Chain 1: Iteration: 1000 / 10000 [ 10%]  (Warmup)
Chain 1: Iteration: 2000 / 10000 [ 20%]  (Warmup)
Chain 1: Iteration: 3000 / 10000 [ 30%]  (Warmup)
Chain 1: Iteration: 4000 / 10000 [ 40%]  (Warmup)
Chain 1: Iteration: 5000 / 10000 [ 50%]  (Warmup)
Chain 1: Iteration: 5001 / 10000 [ 50%]  (Sampling)
Chain 1: Iteration: 6000 / 10000 [ 60%]  (Sampling)
Chain 1: Iteration: 7000 / 10000 [ 70%]  (Sampling)
Chain 1: Iteration: 8000 / 10000 [ 80%]  (Sampling)
Chain 1: Iteration: 9000 / 10000 [ 90%]  (Sampling)
Chain 1: Iteration: 10000 / 10000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0.047 seconds (Warm-up)
Chain 1:                0.049 seconds (Sampling)
Chain 1:                0.096 seconds (Total)
Chain 1: 
```

```r
fit
```

```
Inference for Stan model: anon_model.
1 chains, each with iter=10000; warmup=5000; thin=1; 
post-warmup draws per chain=5000, total post-warmup draws=5000.
                   mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
slope              0.57    0.01 0.41 -0.11  0.30  0.53  0.81  1.51  1216    1
intercept         -0.74    0.05 1.81 -4.30 -1.87 -0.72  0.40  2.74  1314    1
is_slope_positive  0.95    0.01 0.23  0.00  1.00  1.00  1.00  1.00  1535    1
lp__              -5.51    0.03 1.12 -8.47 -5.95 -5.21 -4.71 -4.41  1241    1
Samples were drawn using NUTS(diag_e) at Thu Mar  7 11:05:05 2024.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
```

Hence the correct answer is 0.95, the posterior mean of `is_slope_positive`.
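The same probability can also be estimated without modifying the Stan program at all, by averaging the indicator over the posterior draws in R (a sketch, assuming the `fit` object from above is in scope):

```r
slope_draws = rstan::extract(fit)$slope
# Monte Carlo estimate of P(slope > 0 | Y = y):
# the mean of an indicator is the fraction of draws where the event holds
mean(slope_draws > 0)
```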