suppressPackageStartupMessages(require(rstan))
suppressPackageStartupMessages(require(ggplot2))
suppressPackageStartupMessages(require(dplyr))
set.seed(1)
df = read.csv(url("https://raw.githubusercontent.com/UBC-Stat-ML/web447/main/data/launches.csv")) %>% filter(LV.Type == "Ariane 1")
success_indicators = df$Suc_bin
rmarkdown::paged_table(df)
Stan: hands on
Outline
We revisit the time-varying Ariane 1 rocket failure probability model to practice the Stan syntax introduced this week.
plot(success_indicators, xlab = "Launch index i")
Model
Recall the model we discussed previously (with the difference that we now use an SD of 10):
\[\begin{align*} \text{slope} &\sim \mathcal{N}(0, 10) \\ \text{intercept} &\sim \mathcal{N}(0, 10) \\ \theta(i) &= \text{logistic}(\text{slope} \cdot i + \text{intercept}) \\ y_i &\sim {\mathrm{Bern}}(\theta(i)) \end{align*}\]
…which you also implemented in simPPLe as part of exercise 4:
logistic_regression = function() {
  intercept = simulate(Norm(0, 10))
  slope = simulate(Norm(0, 10))
  for (i in seq_along(success_indicators)) {
    success_probability = plogis(intercept + i*slope)
    observe(success_indicators[i], Bern(success_probability))
  }
  return(c(intercept, slope))
}
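As a quick sanity check (not part of the original exercise), note that plogis is R's logistic function, matching the formula for \(\theta(i)\) in the model above:

```r
# plogis(x) = 1 / (1 + exp(-x)), i.e., the logistic function in the model
x = 1.5
stopifnot(all.equal(plogis(x), 1 / (1 + exp(-x))))
plogis(0)  # 0.5: a zero linear predictor gives a 50% success probability
```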
Translation into Stan
Question 1: use the template below to translate the above model into Stan. Set the seed to 1, run 10000 MCMC iterations, and report the posterior mean of the slope parameter.
data {
int N;
array[N] int y;
}
parameters {
real slope;
real intercept;
}
model {
// complete this
}
The value closest to the posterior mean of the slope parameter is…
- 0.6
- 0.3
- 0.1
- -0.7
- None of the above
data {
int N;
array[N] int y;
}
parameters {
real slope;
real intercept;
}
model {
  slope ~ normal(0, 10);
  intercept ~ normal(0, 10);
  for (i in 1:N) {
    y[i] ~ bernoulli(inv_logit(intercept + slope * i));
  }
}
fit = sampling(
  logistic,
  data = list(
    y = success_indicators,
    N = length(success_indicators)
  ),
  chains = 1,
  iter = 10000
)
SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
Chain 1:
Chain 1: Gradient evaluation took 5e-06 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1:
Chain 1:
Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
Chain 1:
Chain 1: Elapsed Time: 0.046 seconds (Warm-up)
Chain 1: 0.046 seconds (Sampling)
Chain 1: 0.092 seconds (Total)
Chain 1:
fit
Inference for Stan model: anon_model.
1 chains, each with iter=10000; warmup=5000; thin=1;
post-warmup draws per chain=5000, total post-warmup draws=5000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
slope 0.58 0.01 0.44 -0.12 0.26 0.52 0.82 1.61 1009 1
intercept -0.71 0.06 1.89 -4.46 -1.90 -0.64 0.55 2.86 1056 1
lp__ -5.59 0.03 1.19 -8.85 -6.07 -5.23 -4.74 -4.41 1195 1
Samples were drawn using NUTS(diag_e) at Thu Mar 7 11:04:48 2024.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
Hence the correct choice was 0.6.
Using the posterior
Question 2: compute a 95% credible interval on the slope parameter. Use the output of print on the fit object.
- \([-4.5, 2.9]\)
- \([-0.3, 1.4]\)
- \([-0.1, 1.6]\)
- \([0.2, 0.9]\)
- None of the above
Using again the printout of the fit object:
fit
Inference for Stan model: anon_model.
1 chains, each with iter=10000; warmup=5000; thin=1;
post-warmup draws per chain=5000, total post-warmup draws=5000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
slope 0.58 0.01 0.44 -0.12 0.26 0.52 0.82 1.61 1009 1
intercept -0.71 0.06 1.89 -4.46 -1.90 -0.64 0.55 2.86 1056 1
lp__ -5.59 0.03 1.19 -8.85 -6.07 -5.23 -4.74 -4.41 1195 1
Samples were drawn using NUTS(diag_e) at Thu Mar 7 11:04:48 2024.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
We can get a quantile-based credible interval by looking at the columns 2.5%
and 97.5%
(i.e., chopping off 5%/2 on each side): this yields \([-0.12, 1.61]\).
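The same interval can be computed directly from posterior draws with quantile. A minimal sketch: with rstan, the draws would come from rstan::extract(fit)$slope; here we use simulated stand-in draws (matching the posterior mean and SD above) since the fit object is not reproduced in this snippet:

```r
set.seed(2)
# stand-in for rstan::extract(fit)$slope
slope_draws = rnorm(5000, mean = 0.58, sd = 0.44)
# chop off 2.5% on each side to get a quantile-based 95% credible interval
quantile(slope_draws, probs = c(0.025, 0.975))
```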
Question 3: translate “your updated belief that the Ariane 1 rockets were improving” into a mathematical expression.
- \(\mathbb{E}[\text{slope}| Y = y] > 0\)
- \(\mathbb{E}[\text{slope}] > 0\)
- \(\mathbb{P}(\text{slope} > 0 | Y = y)\)
- \(\mathbb{P}(\text{slope} > 0)\)
- None of the above
“Ariane 1 rockets were improving” translates to the event \((\text{slope} > 0)\). Hence our “updated belief,” which is just the posterior probability, is \(\mathbb{P}(\text{slope} > 0 | Y = y)\).
Question 4: estimate the numerical value of the expression in the last question.
- 0.65
- 0.75
- 0.85
- 0.95
- None of the above
Here we want to add a new random variable to our model, defined as: \[\text{is\_slope\_positive} = \mathbb{1}[\text{slope} > 0].\] Why? Because then the posterior mean of the indicator will yield the required probability: \[\mathbb{E}[\mathbb{1}[\text{slope} > 0]|Y = y] = \mathbb{P}(\text{slope} > 0 | Y = y),\] and we will be able to read off that mean from the printout of the fit object.
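This identity (the mean of an indicator estimates a probability) is easy to illustrate with a toy Monte Carlo example, using a standard-looking distribution rather than our actual posterior:

```r
set.seed(2)
# toy distribution, not our posterior: slope ~ N(1, 1)
draws = rnorm(100000, mean = 1, sd = 1)
# the average of the indicator 1[draw > 0] estimates P(draw > 0),
# which here is pnorm(1), approximately 0.84
mean(draws > 0)
```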
Now, how do we add the is_slope_positive random variable to our Stan model?
Since the random variable is defined using “=” rather than “\(\sim\)”, we can do this using a transformed parameters block, introduced earlier this week.
Notice, however, that is_slope_positive is not used within the model block; in such cases, an alternative is to use a generated quantities block, which we demonstrate here:
data {
int N;
array[N] int y;
}
parameters {
real slope;
real intercept;
}
model {
  slope ~ normal(0, 10);
  intercept ~ normal(0, 10);
  for (i in 1:N) {
    y[i] ~ bernoulli(inv_logit(intercept + slope * i));
  }
}
generated quantities {
real is_slope_positive = slope > 0 ? 1 : 0;
}
fit = sampling(
  logistic_with_generated,
  data = list(
    y = success_indicators,
    N = length(success_indicators)
  ),
  chains = 1,
  iter = 10000
)
SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
Chain 1:
Chain 1: Gradient evaluation took 5e-06 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1:
Chain 1:
Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
Chain 1:
Chain 1: Elapsed Time: 0.047 seconds (Warm-up)
Chain 1: 0.049 seconds (Sampling)
Chain 1: 0.096 seconds (Total)
Chain 1:
fit
Inference for Stan model: anon_model.
1 chains, each with iter=10000; warmup=5000; thin=1;
post-warmup draws per chain=5000, total post-warmup draws=5000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
slope 0.57 0.01 0.41 -0.11 0.30 0.53 0.81 1.51 1216 1
intercept -0.74 0.05 1.81 -4.30 -1.87 -0.72 0.40 2.74 1314 1
is_slope_positive 0.95 0.01 0.23 0.00 1.00 1.00 1.00 1.00 1535 1
lp__ -5.51 0.03 1.12 -8.47 -5.95 -5.21 -4.71 -4.41 1241 1
Samples were drawn using NUTS(diag_e) at Thu Mar 7 11:05:05 2024.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
Hence the correct answer is 0.95, the posterior mean of is_slope_positive.
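As an aside, the generated quantities block is convenient but not required: with an rstan fit object one could equivalently compute mean(rstan::extract(fit)$slope > 0) in R. A sketch with simulated stand-in draws (matching the posterior summary above), since the fit object is not reproduced here:

```r
set.seed(2)
# stand-in for rstan::extract(fit)$slope
slope_draws = rnorm(5000, mean = 0.57, sd = 0.41)
# Monte Carlo estimate of P(slope > 0 | Y = y)
mean(slope_draws > 0)
```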