5  Bayesian version of the linear mixed model

We illustrate how this model can be handled in a Bayesian framework. To that end, we use the R library brms.

The syntax is very similar to the lmer model. The output looks slightly different, but all (or similar) elements we obtained with the previous algorithms are found here as well. Technically speaking, the Estimate is here the mean of the posterior distribution of a coefficient (the posterior median can be requested with robust = TRUE). The 'CI' stands here for credible interval, which is derived from quantiles of the posterior distribution. Note that only the non-informative default priors are used in this case, which leads to estimates and intervals that are very close to those of the lmer model.

If this all sounds like gibberish to you, don't worry. It simply means you are not yet familiar with Bayesian analysis, which uses a different philosophy about data, how they come about, and how to draw inference from them (a philosophy that is actually akin to the random-effects philosophy and shrinkage). It is not our aim at this moment to explain the differences with the more traditional approaches, but we would like to point out that the linear model is applicable in both contexts, the interpretation is similar, and the practical application in R is not much more difficult. A very practical advantage is that the Bayesian approach uses algorithms that are more robust and flexible, albeit computationally intensive.

This flexibility allows a straightforward extension to more complex models, in particular when the residuals are not normally distributed (generalized linear mixed models), as sketched below.
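For example, a count outcome can be modelled by merely changing the family argument in the brm() call. A minimal sketch, where the data frame hens and the count response eggs are hypothetical:

# hypothetical data frame 'hens' with a count response 'eggs';
# only the family argument differs from the Gaussian models in this section
modglmm <- brm(eggs ~ supplier + (1 | cage),
               data = hens, family = poisson())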

However, we start simple and revisit the model modlmer.1.1 fitted on the unequally replicated data of chicken weights.

Note the similarity between the two packages in how the model is set up.

library(brms)  # load the package, if not done before

modbrm.1.1 <- brm(data=chickens,
                  endweight~supplier + (1|cage),
                  warmup = 1000, iter = 4000, chains = 4)
Compiling Stan program...
Start sampling
summary(modbrm.1.1)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: endweight ~ supplier + (1 | cage) 
   Data: chickens (Number of observations: 88) 
  Draws: 4 chains, each with iter = 4000; warmup = 1000; thin = 1;
         total post-warmup draws = 12000

Group-Level Effects: 
~cage (Number of levels: 22) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)    19.91      6.89     6.69    34.47 1.00     3074     3881

Population-Level Effects: 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept      1851.40     10.83  1830.40  1872.96 1.00     6489     7085
supplierAV102    -9.66     17.65   -44.63    25.34 1.00     6979     6617
supplierBK77    -45.12     17.67   -79.75    -9.61 1.00     7755     7556
supplierCM10      1.70     17.32   -32.70    36.38 1.00     7102     7725
supplierLG22     -5.43     17.35   -39.54    28.93 1.00     6859     7487

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma    33.89      3.09    28.44    40.55 1.00     8808     8743

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

The output looks slightly different but also contains the standard deviation of the random effect ~cage (19.91) and the residual standard deviation (sigma, 33.89), with values very similar to what was obtained in the REML solution.
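To place the two fits side by side, the variance components can be extracted from both model objects with the same function (a minimal sketch; output not shown):

VarCorr(modlmer.1.1)  # REML estimates from lme4
VarCorr(modbrm.1.1)   # posterior summaries from brms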

The fixed-effect estimates are also very similar.

summary(modlmer.1.1)$coefficients
                 Estimate Std. Error      t value
(Intercept)   1851.520833   10.46964 176.84665264
supplierAV102   -9.958333   16.55395  -0.60156823
supplierBK77   -45.302083   16.55395  -2.73663206
supplierCM10     1.547917   16.55395   0.09350736
supplierLG22    -5.520833   16.55395  -0.33350540
fixef(modbrm.1.1)
                 Estimate Est.Error       Q2.5       Q97.5
Intercept     1851.399532  10.83218 1830.39842 1872.958092
supplierAV102   -9.661970  17.64551  -44.62863   25.344796
supplierBK77   -45.120679  17.67481  -79.74689   -9.610244
supplierCM10     1.703489  17.32417  -32.69540   36.376559
supplierLG22    -5.431660  17.34902  -39.54307   28.931550

In the same vein, this is the code for the model with the cages divided over rooms (cage nested within room). Here we rely on the default sampler settings (4 chains of 2000 iterations, of which 1000 warmup).

modbrms.1.2 <- brm(data=chickens,
                   endweight~supplier + (1|room/cage))
Compiling Stan program...
Start sampling
summary(modbrms.1.2)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: endweight ~ supplier + (1 | room/cage) 
   Data: chickens (Number of observations: 88) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Group-Level Effects: 
~room (Number of levels: 7) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)    14.05      9.32     0.78    36.39 1.00     1183     1547

~room:cage (Number of levels: 22) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)    15.91      7.58     1.80    31.57 1.00     1003     1290

Population-Level Effects: 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept      1852.56     11.71  1829.26  1876.68 1.00     2523     2432
supplierAV102   -13.73     16.53   -45.99    19.68 1.00     2819     2495
supplierBK77    -44.27     15.98   -75.14   -11.70 1.00     2696     2494
supplierCM10     -0.96     16.51   -33.20    33.70 1.00     2182     1949
supplierLG22     -7.12     16.30   -39.79    24.96 1.00     2404     2518

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma    33.83      3.06    28.51    40.36 1.00     2905     2883

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
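All Rhat values are 1.00 and the effective sample sizes are comfortably large, which suggests the chains converged. A quick visual check of the chains is also possible (a minimal sketch; the plots are not shown here):

# trace plots and posterior densities for the model parameters
plot(modbrms.1.2)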

Bayesian inference is based on posterior samples: values of the parameters that are plausible given the observed data. These samples can very easily be used for all sorts of comparisons, combinations, etc. See for example the code below to obtain credible intervals, and a sort of p-value for BK77 (the difference with the reference) or for the difference between BK77 and AV102.

draws <- as.data.frame(modbrm.1.1)
# to see what this looks like: the first 10 samples for the first 5 coefficients
draws[1:10,1:5]
   b_Intercept b_supplierAV102 b_supplierBK77 b_supplierCM10 b_supplierLG22
1     1856.707       -9.046575      -50.88793      0.7610522      -5.088599
2     1858.315      -26.704335      -49.32711    -11.8958620     -20.970575
3     1851.208      -23.236517      -43.95605      7.0857549      13.719163
4     1849.474      -25.315941      -40.27164      0.4799097      -7.461762
5     1849.857      -14.901464      -51.84279     -7.4637350     -14.891804
6     1859.788      -15.438986      -56.89661     -4.4154129     -20.052013
7     1849.459       -3.885772      -42.99888    -20.6123274       1.308401
8     1844.780       -1.611583      -31.21265      6.4372187      -7.144690
9     1856.060      -14.576563      -55.10912    -18.4766190     -15.012992
10    1853.230      -18.497803      -54.40711     12.4601095      -2.676818

The mean, median and 95% credible interval:

mean(draws$b_supplierBK77)
[1] -45.12068
quantile(draws$b_supplierBK77,c(0.5,0.025,0.975))
       50%       2.5%      97.5% 
-45.142320 -79.746889  -9.610244 

The quantiles correspond to what was already provided as l-95% CI and u-95% CI in summary(modbrm.1.1), and the posterior mean matches the Estimate column.
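The same numbers can also be read directly from the fixef() matrix by indexing on the row name (a minimal sketch; output not shown):

# the summary row for BK77: Estimate, Est.Error, Q2.5, Q97.5
fixef(modbrm.1.1)["supplierBK77", ]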

The probability of finding a difference bigger than zero (that is, the probability that BK77 performs better than the reference supplier Our104):

sum(draws$b_supplierBK77 > 0)/nrow(draws)
[1] 0.00725
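Since TRUE counts as 1 in R, the same proportion can be written more compactly as the mean of a logical vector:

mean(draws$b_supplierBK77 > 0)  # identical to sum(...)/nrow(draws)
[1] 0.00725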

Or the probability that BK77 is better than AV102:

# the difference is computed for each posterior draw (mutate() comes from dplyr)
draws <- draws %>% mutate(difBK77AV102 = b_supplierBK77 - b_supplierAV102)
sum(draws$difBK77AV102 > 0)/nrow(draws)
[1] 0.03541667
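brms also offers the hypothesis() function, which carries out this kind of comparison directly on the fitted model; its Post.Prob column reports the posterior probability of the stated inequality (a minimal sketch; output not shown):

# posterior probability that BK77 exceeds AV102, computed by brms itself
hypothesis(modbrm.1.1, "supplierBK77 > supplierAV102")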