Interpreting coefficients in GLMs

May 7, 2018 7 min read statistics-links

This is an archive of an external source. The original is here

Date: November 11, 2016

Author: Gordana Popovic

In linear models, the interpretation of model parameters is linear. For example, if a you were modelling plant height against altitude and your coefficient for altitude was -0.9, then plant height will decrease by 1.09 for every increase in altitude of 1 unit.

For generalised linear models, the interpretation is not this straightforward. Here, I will explain how to interpret the co-efficients in generalised linear models (glms). First you will want to read our pages on glms for binary and count data page on interpreting coefficients in linear models.

Poisson and negative binomial GLMs

In Poisson and negative binomial glms, we use a log link. The actual model we fit with one covariate $x$ looks like this

$Y \sim Poisson (λ)$ $l o g (λ) = β_{0} + β_{1} x$

here $λ$ is the mean of Y. So if we have an initial value of the covariate $x_{0}$ , then the predicted value of the mean $λ_{0}$ is given by

$l o g (λ_{0}) = β_{0} + β_{1} x_{0}$

If we now increase the covariate by 1, we get a new mean $λ_{1}$ ,

$l o g (λ_{1}) = β_{0} + β_{1} (x_{0} + 1) = β_{0} + β_{1} x_{0} + β_{1} = l o g (λ_{0}) + β_{1}$

So the log of the mean of Y increases by $β_{1}$ when we increase x by 1. But we are not really interested in how the log mean changes, we would like to know on average how Y changes. If we take the exponential of both sides

$λ_{1} = λ_{0} e x p (β_{1})$

So the mean of Y is multiplied by $e x p (β_{1})$ when we increase $x$ by 1 unit.

N <- 120
x <- rnorm(N)
mu <- exp(1+0.2*x)
Y <- rpois(N, lambda = mu)
glm1 <- glm(Y~x, family = poisson)
glm1$coefficients

## (Intercept)           x 
##   1.0628067   0.1278092

exp(glm1$coefficients[2])

##        x 
## 1.136336

So here increasing $x$ by 1 unit multiplies the mean value of Y by $e x p (β_{1}) = 1.25$ . The same thing is true for negative binomial glms as they have the same link function.

Binomial GLMs

#### Logistic regression
Things become much more complicated in binomial glms. The model here is actually a model of log odds, so we need to start with an explanation of those. The odds of an event are the probability success divided by the probability of failure. So if the probability of success is $p$ then the odds are:

$Odds = \frac{p}{1 - p}$

As p increases, so do the odds. The equation for a logistic regression looks like this:

$Y \sim binomial (p)$ $l o g (\frac{p}{1 - p}) = β_{0} + β_{1} x$

Skipping some maths that is very similar to the above, we can obtain an interpretation for the coefficient of $x$ in the model in terms of the odds. When we increase $x$ by one unit the odds are multiplied by $e x p (β_{1})$ . Odds are not the most intuitive thing to interpret, but they do increase when p increases, so that if your coefficient $β_{1}$ is positive, increasing $x$ will increase your probability.

bY <- Y>0 #turning counts into presence absence
bin1 <- glm(bY~x,family = binomial)
summary(bin1)

## 
## Call:
## glm(formula = bY ~ x, family = binomial)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.4383   0.3373   0.3465   0.3526   0.3861  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  2.78857    0.39303   7.095 1.29e-12 ***
## x            0.06468    0.37057   0.175    0.861    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 53.366  on 119  degrees of freedom
## Residual deviance: 53.335  on 118  degrees of freedom
## AIC: 57.335
## 
## Number of Fisher Scoring iterations: 5

So when we increase $x$ by one unit, the odds of Y are multiplied by $e x p (β_{1}) = 2.11$

Complementary log-log

Possibly a more intuitive model is a binomial regression with a complementary log-log link function. This link function is based on the assumption that you have some counts, which are Poisson distributed, but you’ve decided to turn them into presence/absence.

$Y \sim binomial (p)$ $l o g (- l o g (1 - p)) = β_{0} + β_{1} x$

In that case you can interpret your coefficients in a similar way as the Poisson regression. When you increase $x$ by 1, the mean of your underlying count (which you have turned into presence/absence) is multiplied by $e x p (β_{1})$ .

library(mvabund)
bin2 <- manyglm(bY~x, family = binomial(link = "cloglog"))
coef(bin2)

##                     bY
## (Intercept) 1.04635177
## x           0.02013689

The interpretation is now the same as in the Poisson case, when we increase $x$ by 1, the mean of the underlying count is multiplied by $e x p (β_{1})$ .

Log binomial model

It is possible to use a log link function with the binomial distribution family = binomial(link = log). In this case you can interpret the coefficients as multiplying the probabilities by $e x p (β_{1})$ , however these models can give you predicted probabilities greater than 1, and often don’t converge (don’t give an answer).

Offsets

Sometimes we know the effect of a particular variable (call it $z$ ) on the response is proportional, so that when we double $z$ we expect the response to double on average. The most common time you see this is with sampling intensity.

If you sample soil and count critters, all other things being equal, you would expect twice the critters in twice the amount of soil. If you have a variable like this it is tempting to divide your response (count) by the amount of soil to standardise the data. Unfortunately this will take counts, which we know how to model with glms, and turn them into something we do not know how to model. Fortunately this situation is easily dealt with using offsets. First, let’s simulate some data for amount of soil, depth (our predictor variable) and count data (with a poisson distribution) where the couunts depend on how much soil was sampled.

soil <- exp(rbeta(N, shape1 = 8, shape2 = 1))
depth <- rnorm(N)
mu <- soil*exp(0.5+0.5*depth)
count <- rpois(N, lambda = mu)

Now, we can model counts with depth as our predictor and soil quantity as an offset.

off_mod <- glm(Y~depth+offset(log(soil)), family = poisson)
summary(off_mod)

## 
## Call:
## glm(formula = Y ~ depth + offset(log(soil)), family = poisson)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.4974  -0.6830  -0.1568   0.5521   2.5910  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)   
## (Intercept)  0.16048    0.05368   2.989   0.0028 **
## depth        0.03153    0.05458   0.578   0.5635   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 141.22  on 119  degrees of freedom
## Residual deviance: 140.88  on 118  degrees of freedom
## AIC: 470.36
## 
## Number of Fisher Scoring iterations: 5

If we ignored the soil amount, we could have misleading conclusions. If the soil amount is correlated with another variable in your model, then leaving out the offset will affect the coefficient of that variable, as in the discussion of conditional/marginal interpretations here. The offset will also often account for a lot of the variation in the response, so including it will give you a better model overall. What if you’re not sure if the relationship is exactly proportional? In that case just include the variable in your model as a coefficient, and the model will decide the best relationship between it and your response.

coef_mod <- glm(Y~depth+log(soil), family = poisson)
summary(coef_mod)

## 
## Call:
## glm(formula = Y ~ depth + log(soil), family = poisson)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.4669  -0.6503  -0.2537   0.5792   2.6574  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.48414    0.61501   0.787    0.431
## depth        0.03267    0.05469   0.597    0.550
## log(soil)    0.64191    0.67865   0.946    0.344
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 141.93  on 119  degrees of freedom
## Residual deviance: 140.61  on 117  degrees of freedom
## AIC: 472.08
## 
## Number of Fisher Scoring iterations: 5

The coefficient the model estimated is close to 1, which would be equivalent to an offset.

Author: Gordana Popovic

```

glm