In this vignette, we explain how one can compute marginal likelihoods, Bayes factors, and posterior model probabilities using a simple hierarchical normal model implemented in Stan. This vignette uses the same models and data as the JAGS vignette.
The model that we will use assumes that each of the $n$ observations $y_i$ (where $i$ indexes the observation, $i = 1, 2, \ldots, n$) is normally distributed with corresponding mean $\theta_i$ and a common known variance $\sigma^2$: $y_i \sim \mathcal{N}(\theta_i, \sigma^2)$. Each $\theta_i$ is drawn from a normal group-level distribution with mean $\mu$ and variance $\tau^2$: $\theta_i \sim \mathcal{N}(\mu, \tau^2)$. For the group-level mean $\mu$, we use a normal prior distribution of the form $\mathcal{N}(\mu_0, \tau^2_0)$. For the group-level variance $\tau^2$, we use an inverse-gamma prior of the form $\text{Inv-Gamma}(\alpha, \beta)$.
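In compact notation, the full generative model is

$$
\begin{aligned}
y_i \mid \theta_i &\sim \mathcal{N}(\theta_i, \sigma^2), \qquad i = 1, \ldots, n,\\
\theta_i \mid \mu, \tau^2 &\sim \mathcal{N}(\mu, \tau^2),\\
\mu &\sim \mathcal{N}(\mu_0, \tau^2_0),\\
\tau^2 &\sim \text{Inv-Gamma}(\alpha, \beta).
\end{aligned}
$$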
In this example, we are interested in comparing the null model $\mathcal{H}_0$, which posits that the group-level mean $\mu = 0$, to the alternative model $\mathcal{H}_1$, which allows $\mu$ to be different from zero. First, we generate some data from the null model:
library(bridgesampling)
### generate data ###
set.seed(12345)
mu <- 0
tau2 <- 0.5
sigma2 <- 1
n <- 20
theta <- rnorm(n, mu, sqrt(tau2))
y <- rnorm(n, theta, sqrt(sigma2))
Next, we specify the prior parameters $\mu_0$, $\tau^2_0$, $\alpha$, and $\beta$. The particular values below are a simple, weakly informative default choice (the exact values are an assumption on our part):
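### set prior parameters ###
# note: these values are an assumed default (standard-normal prior on mu,
# unit inverse-gamma parameters); other weakly informative choices work too
mu0 <- 0
tau20 <- 1
alpha <- 1
beta <- 1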
Next, we implement the models in Stan. Note that to compute the (log) marginal likelihood for a Stan model, we need to specify the model in a certain way. Instead of using "~" signs for specifying distributions, we need to use the (log) density functions directly. The reason for this is that when using the "~" sign, constant terms that are not needed for sampling from the posterior are dropped. However, for computing the marginal likelihood, these constants need to be retained. For instance, instead of writing y ~ normal(mu, sigma) we would need to write target += normal_lpdf(y | mu, sigma). The models can then be specified and compiled as follows (note that it is necessary to install rstan for this):
library(rstan)
# models
stancodeH0 <- 'data {
  int<lower=1> n; // number of observations
  vector[n] y; // observations
  real<lower=0> alpha; // shape of the inverse-gamma prior on tau2
  real<lower=0> beta; // scale of the inverse-gamma prior on tau2
  real<lower=0> sigma2; // known observation variance
}
parameters {
  real<lower=0> tau2; // group-level variance
  vector[n] theta; // participant effects
}
model {
  target += inv_gamma_lpdf(tau2 | alpha, beta);
  target += normal_lpdf(theta | 0, sqrt(tau2));
  target += normal_lpdf(y | theta, sqrt(sigma2));
}
'
stancodeH1 <- 'data {
  int<lower=1> n; // number of observations
  vector[n] y; // observations
  real mu0; // prior mean of mu
  real<lower=0> tau20; // prior variance of mu
  real<lower=0> alpha; // shape of the inverse-gamma prior on tau2
  real<lower=0> beta; // scale of the inverse-gamma prior on tau2
  real<lower=0> sigma2; // known observation variance
}
parameters {
  real mu; // group-level mean
  real<lower=0> tau2; // group-level variance
  vector[n] theta; // participant effects
}
model {
  target += normal_lpdf(mu | mu0, sqrt(tau20));
  target += inv_gamma_lpdf(tau2 | alpha, beta);
  target += normal_lpdf(theta | mu, sqrt(tau2));
  target += normal_lpdf(y | theta, sqrt(sigma2));
}
'
# compile models
stanmodelH0 <- stan_model(model_code = stancodeH0, model_name = "stanmodel")
stanmodelH1 <- stan_model(model_code = stancodeH1, model_name = "stanmodel")
Now we can fit the null and the alternative model in Stan. One usually requires a larger number of posterior samples for estimating the marginal likelihood than for simply estimating the model parameters. This is the reason for using a comparatively large number of samples for these simple models.
# fit models
stanfitH0 <- sampling(stanmodelH0, data = list(y = y, n = n,
                                               alpha = alpha,
                                               beta = beta,
                                               sigma2 = sigma2),
                      iter = 50000, warmup = 1000, chains = 3, cores = 1)
stanfitH1 <- sampling(stanmodelH1, data = list(y = y, n = n,
                                               mu0 = mu0,
                                               tau20 = tau20,
                                               alpha = alpha,
                                               beta = beta,
                                               sigma2 = sigma2),
                      iter = 50000, warmup = 1000, chains = 3, cores = 1)
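Although not strictly necessary here, it is good practice to confirm that the chains have converged before estimating the marginal likelihood; one way to do so (using rstan's summary method) is:

# check convergence: Rhat should be close to 1 and the effective
# sample size (n_eff) should be large for all parameters
round(summary(stanfitH1)$summary[, c("n_eff", "Rhat")], 2)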
Computing the (log) marginal likelihoods via the bridge_sampler function is now easy: we only need to pass the stanfit objects, which contain all the necessary information. We use silent = TRUE to suppress printing the number of iterations to the console:
# compute log marginal likelihood via bridge sampling for H0
H0.bridge <- bridge_sampler(stanfitH0, silent = TRUE)
# compute log marginal likelihood via bridge sampling for H1
H1.bridge <- bridge_sampler(stanfitH1, silent = TRUE)
We obtain:
## Bridge sampling estimate of the log marginal likelihood: -37.53183
## Estimate obtained in 5 iteration(s) via method "normal".
## Bridge sampling estimate of the log marginal likelihood: -37.79683
## Estimate obtained in 4 iteration(s) via method "normal".
We can use the error_measures function to compute an approximate percentage error of the estimates:
# compute percentage errors
H0.error <- error_measures(H0.bridge)$percentage
H1.error <- error_measures(H1.bridge)$percentage
We obtain:
## [1] "0.143%"
## [1] "0.164%"
To compare the null model and the alternative model, we can compute the Bayes factor by using the bf function. In our case, we compute $\text{BF}_{01}$, that is, the Bayes factor which quantifies how much more likely the data are under the null versus the alternative model:
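# compute Bayes factor
BF01 <- bf(H0.bridge, H1.bridge)
print(BF01)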
## Estimated Bayes factor in favor of H0.bridge over H1.bridge: 1.30343
In this case, the Bayes factor is close to one, indicating that there is not much evidence for either model. We can also compute posterior model probabilities by using the post_prob function:
# compute posterior model probabilities (assuming equal prior model probabilities)
post1 <- post_prob(H0.bridge, H1.bridge)
print(post1)
## H0.bridge H1.bridge
## 0.5658657 0.4341343
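These values follow from Bayes' rule applied at the model level: with equal prior model probabilities, the posterior probability of $\mathcal{H}_0$ is

$$
p(\mathcal{H}_0 \mid y) = \frac{\text{BF}_{01}}{1 + \text{BF}_{01}} = \frac{1.30343}{1 + 1.30343} \approx 0.566,
$$

which matches the printed output up to the rounding of the Bayes factor.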
When the argument prior_prob is not specified, as is the case here, the prior model probabilities of all models under consideration are set equal (i.e., in this case with two models, to 0.5 each). However, if we had prior knowledge about how plausible the two models are, we could use the prior_prob argument to specify different prior model probabilities:
# compute posterior model probabilities (using user-specified prior model probabilities)
post2 <- post_prob(H0.bridge, H1.bridge, prior_prob = c(.6, .4))
print(post2)
## H0.bridge H1.bridge
## 0.6616079 0.3383921