---
title: "Hierarchical Normal Example (Stan)"
author: "Quentin F. Gronau"
date: "`r Sys.Date()`"
show_toc: true
output:
  knitr:::html_vignette:
    toc: yes
vignette: >
  %\VignetteIndexEntry{Hierarchical Normal Example Stan}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

In this vignette, we explain how one can compute marginal likelihoods, Bayes factors, and posterior model probabilities using a simple hierarchical normal model implemented in `Stan`. This vignette uses the same models and data as the [`Jags` vignette](bridgesampling_example_jags.html).

## Model and Data
The model that we will use assumes that each of the $n$ observations $y_i$ (where $i$ indexes the observation, $i = 1,2,...,n$) is normally distributed with corresponding mean $\theta_i$ and a common known variance $\sigma^2$: $y_i \sim \mathcal{N}(\theta_i, \sigma^2)$. Each $\theta_i$ is drawn from a normal group-level distribution with mean $\mu$ and variance $\tau^2$: $\theta_i \sim \mathcal{N}(\mu, \tau^2)$. For the group-level mean $\mu$, we use a normal prior distribution of the form $\mathcal{N}(\mu_0, \tau^2_0)$. For the group-level variance $\tau^2$, we use an inverse-gamma prior of the form $\text{Inv-Gamma}(\alpha, \beta)$.

In this example, we are interested in comparing the null model $\mathcal{H}_0$, which posits that the group-level mean $\mu = 0$, to the alternative model $\mathcal{H}_1$, which allows $\mu$ to be different from zero. First, we generate some data from the null model:

```{r}
library(bridgesampling)

### generate data ###
set.seed(12345)

mu <- 0
tau2 <- 0.5
sigma2 <- 1

n <- 20
theta <- rnorm(n, mu, sqrt(tau2))
y <- rnorm(n, theta, sqrt(sigma2))
  
```

Next, we specify the prior parameters $\mu_0$, $\tau^2_0$, $\alpha$, and $\beta$:

```{r,eval=FALSE}
### set prior parameters ###
mu0 <- 0
tau20 <- 1
alpha <- 1
beta <- 1
```

## Specifying the Models
Next, we implement the models in `Stan`. Note that to compute the (log) marginal likelihood for a `Stan` model, we need to specify the model in a certain way. Instad of using `"~"` signs for specifying distributions, we need to directly use the (log) density functions. The reason for this is that when using the `"~"` sign, constant terms are dropped which are not needed for sampling from the posterior. However, for computing the marginal likelihood, these constants need to be retained. For instance, instead of writing `y ~ normal(mu, sigma)` we would need to write `target += normal_lpdf(y | mu, sigma)`. The models can then be specified and compiled as follows (note that it is necessary to install `rstan` for this):
```{r, eval=FALSE}
library(rstan)

# models
stancodeH0 <- 'data {
  int<lower=1> n; // number of observations
  vector[n] y; // observations
  real<lower=0> alpha;
  real<lower=0> beta;
  real<lower=0> sigma2;
}
parameters {
  real<lower=0> tau2; // group-level variance
  vector[n] theta; // participant effects
}
model {
  target += inv_gamma_lpdf(tau2 | alpha, beta);
  target += normal_lpdf(theta | 0, sqrt(tau2));
  target += normal_lpdf(y | theta, sqrt(sigma2));
}
'
stancodeH1 <- 'data {
  int<lower=1> n; // number of observations
  vector[n] y; // observations
  real mu0;
  real<lower=0> tau20;
  real<lower=0> alpha;
  real<lower=0> beta;
  real<lower=0> sigma2;
}
parameters {
  real mu;
  real<lower=0> tau2; // group-level variance
  vector[n] theta; // participant effects
}
model {
  target += normal_lpdf(mu | mu0, sqrt(tau20));
  target += inv_gamma_lpdf(tau2 | alpha, beta);
  target += normal_lpdf(theta | mu, sqrt(tau2));
  target += normal_lpdf(y | theta, sqrt(sigma2));
}
'
# compile models
stanmodelH0 <- stan_model(model_code = stancodeH0, model_name="stanmodel")
stanmodelH1 <- stan_model(model_code = stancodeH1, model_name="stanmodel")
```
## Fitting the Models
Now we can fit the null and the alternative model in `Stan`. One usually requires a larger number of posterior samples for estimating the marginal likelihood than for simply estimating the model parameters. This is the reason for using a comparatively large number of samples for these simple models.
```{r, eval=FALSE}
# fit models
stanfitH0 <- sampling(stanmodelH0, data = list(y = y, n = n,
                                               alpha = alpha,
                                               beta = beta,
                                               sigma2 = sigma2),
                      iter = 50000, warmup = 1000, chains = 3, cores = 1)
stanfitH1 <- sampling(stanmodelH1, data = list(y = y, n = n,
                                               mu0 = mu0,
                                               tau20 = tau20,
                                               alpha = alpha,
                                               beta = beta,
                                               sigma2 = sigma2),
                      iter = 50000, warmup = 1000, chains = 3, cores = 1)
```

## Computing the (Log) Marginal Likelihoods
Computing the (log) marginal likelihoods via the `bridge_sampler` function is now easy: we only need to pass the `stanfit` objects which contain all information necessary. We use `silent = TRUE` to suppress printing the number of iterations to the console:
```{r, echo=FALSE}
load(system.file("extdata/", "vignette_example_stan.RData",
                     package = "bridgesampling"))
```

```{r,eval=FALSE}
# compute log marginal likelihood via bridge sampling for H0
H0.bridge <- bridge_sampler(stanfitH0, silent = TRUE)

# compute log marginal likelihood via bridge sampling for H1
H1.bridge <- bridge_sampler(stanfitH1, silent = TRUE)
```
We obtain:
```{r}
print(H0.bridge)
print(H1.bridge)
```
We can use the `error_measures` function to compute an approximate percentage error of the estimates:
```{r,eval=FALSE}
# compute percentage errors
H0.error <- error_measures(H0.bridge)$percentage
H1.error <- error_measures(H1.bridge)$percentage
```
We obtain:
```{r}
print(H0.error)
print(H1.error)
```

## Bayesian Model Comparison
To compare the null model and the alternative model, we can compute the Bayes factor by using the `bf` function.
In our case, we compute $\text{BF}_{01}$, that is, the Bayes factor which quantifies how much more likely the data are under the null versus the alternative model:
```{r}
# compute Bayes factor
BF01 <- bf(H0.bridge, H1.bridge)
print(BF01)
```
In this case, the Bayes factor is close to one, indicating that there is not much evidence for either model. We can also compute posterior model probabilities by using the `post_prob` function:
```{r}
# compute posterior model probabilities (assuming equal prior model probabilities)
post1 <- post_prob(H0.bridge, H1.bridge)
print(post1)
```
When the argument `prior_prob` is not specified, as is the case here, the prior model probabilities of all models under consideration are set equal (i.e., in this case with two models to 0.5). However, if we had prior knowledge about how likely both models are, we could use the `prior_prob` argument to specify different prior model probabilities:
```{r}
# compute posterior model probabilities (using user-specified prior model probabilities)
post2 <- post_prob(H0.bridge, H1.bridge, prior_prob = c(.6, .4))
print(post2)
```