1 Objectives
2 Pre-requisites
3 Data
- 3.1 Read in data
- 3.2 Format data
4 Model with constant parameters
5 Deal with covariates
- 5.1 Site-level covariates
- 5.2 Observation-level covariates
6 Model selection

1 Objectives

Fit static (single-season) occupancy models
Deal with covariates
Carry out model selection

2 Pre-requisites

Load some packages:

library(tidyverse) # manipulate and visualize data
theme_set(theme_light(base_size = 16)) # set theme for visualisation and font size
library(unmarked) # occupancy modelling

3 Data

We use data from Mapping and explaining wolf recolonization in France using dynamic occupancy models and opportunistic data by Louvrier et al. in Ecography available at https://onlinelibrary.wiley.com/doi/abs/10.1111/ecog.02874. We use the year 2013 for illustration. Occasions are months December, January, February and March. A 100km\(^2\) grid was overlaid over France and detections and non-detections were collected. See paper for more details.

3.1 Read in data

Read in data:

wolf_data <- read_csv("data/wolf_occupancy.csv")

Inspect data. The column names are self-explanatory. Note that covariates effort (number of observers in a given cell), altitude and forest were standardized.

wolf_data

## # A tibble: 3,211 × 10
##    idsite  occ1  occ2  occ3  occ4 effort altitude foret      X       Y
##     <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>    <dbl> <dbl>  <dbl>   <dbl>
##  1      8     0     0     0     0 -0.784   -1.03  -1.25 585000 6115000
##  2      9     0     0     0     0 -0.784    1.51  -1.25 595000 6115000
##  3     10     0     0     0     0 -0.784    0.808 -1.25 605000 6115000
##  4     11     0     0     0     0 -0.743    0.675 -1.25 615000 6115000
##  5     12     0     0     0     0 -0.703    0.683 -1.25 625000 6115000
##  6     13     0     0     0     0 -0.703    0.507 -1.25 635000 6115000
##  7     14     0     0     0     0 -0.662   -1.03  -1.25 645000 6115000
##  8     15     0     0     0     0 -0.703   -1.03  -1.25 655000 6115000
##  9     16     0     0     0     0 -0.743   -1.03  -1.25 665000 6115000
## 10     68     0     0     0     0 -0.784    2.02  -1.25 585000 6125000
## # ℹ 3,201 more rows

There are > 3000 cells.

nrow(wolf_data)

## [1] 3211

3.2 Format data

We pick the detections and non-detections, and format them for use in unmarked:

y <- wolf_data[,2:5]
wolf <- unmarkedFrameOccu(y = y)

Inspect:

head(wolf)

## Data frame representation of unmarkedFrame object.
##    y.1 y.2 y.3 y.4
## 1    0   0   0   0
## 2    0   0   0   0
## 3    0   0   0   0
## 4    0   0   0   0
## 5    0   0   0   0
## 6    0   0   0   0
## 7    0   0   0   0
## 8    0   0   0   0
## 9    0   0   0   0
## 10   0   0   0   0

Summarise data:

summary(wolf)

## unmarkedFrame Object
## 
## 3211 sites
## Maximum number of observations per site: 4 
## Mean number of observations per site: 4 
## Sites with at least one detection: 171 
## 
## Tabulation of y observations:
##     0     1 
## 12540   304

Visualize:

plot(wolf)

4 Model with constant parameters

Fit a model with constant parameters, detection first, then occupancy:

fm <- occu(~ 1 ~ 1, data = wolf)

Inspect results:

fm

## 
## Call:
## occu(formula = ~1 ~ 1, data = wolf)
## 
## Occupancy:
##  Estimate     SE     z   P(>|z|)
##     -2.71 0.0839 -32.2 6.74e-228
## 
## Detection:
##  Estimate     SE     z  P(>|z|)
##    -0.499 0.0962 -5.19 2.15e-07
## 
## AIC: 2236.609

Parameter estimates are on the logit scale. We need to back-transform them on their natural scale, ie between 0 and 1. There are several ways to do just that.

First, inspect model coefficients:

coef(fm)

##   psi(Int)     p(Int) 
## -2.7056882 -0.4986384

Get occupancy on logit scale, then back-transform it:

logit_occupancy <- coef(fm)[1]
occupancy <- 1 / (1 + exp(-logit_occupancy))
occupancy

##   psi(Int) 
## 0.06263854

Idem for detection, note that we use the inverse logit function native to R:

logit_det <- coef(fm)[2]
detection <- plogis(logit_det)
detection

##    p(Int) 
## 0.3778607

Alternatively, you may use the unmarked function backTransform:

backTransform(fm, type ='state')

## Backtransformed linear combination(s) of Occupancy estimate(s)
## 
##  Estimate      SE LinComb (Intercept)
##    0.0626 0.00493   -2.71           1
## 
## Transformation: logistic

backTransform(fm, type ='det')

## Backtransformed linear combination(s) of Detection estimate(s)
## 
##  Estimate     SE LinComb (Intercept)
##     0.378 0.0226  -0.499           1
## 
## Transformation: logistic

And to get confidence intervals:

confint(backTransform(fm, type='state'))

##       0.025      0.975
##  0.05364514 0.07302331

confint(backTransform(fm, type='det'))

##      0.025     0.975
##  0.3346802 0.4230697

5 Deal with covariates

5.1 Site-level covariates

Here we consider site-level covariates, which do not change between visits.

Let’s fit another model with detection as a function of effort, and occupancy as a function of forest cover. We need to recreate a dataset that includes these covariates, with one column per covariate:

site.covs <- data.frame(effort = wolf_data$effort,
                        forest = wolf_data$foret)
wolf <- unmarkedFrameOccu(y = y,
                          siteCovs = site.covs)

Summarise:

summary(wolf)

## unmarkedFrame Object
## 
## 3211 sites
## Maximum number of observations per site: 4 
## Mean number of observations per site: 4 
## Sites with at least one detection: 171 
## 
## Tabulation of y observations:
##     0     1 
## 12540   304 
## 
## Site-level covariates:
##      effort            forest        
##  Min.   :-0.7836   Min.   :-1.25500  
##  1st Qu.:-0.7026   1st Qu.:-0.92319  
##  Median :-0.4190   Median :-0.09233  
##  Mean   : 0.0000   Mean   : 0.00000  
##  3rd Qu.: 0.3101   3rd Qu.: 0.75053  
##  Max.   : 4.6039   Max.   : 3.02854

Fit the model:

fm1 <- occu(~ effort ~ forest, data = wolf)

Inspect results:

fm1

## 
## Call:
## occu(formula = ~effort ~ forest, data = wolf)
## 
## Occupancy:
##             Estimate    SE     z  P(>|z|)
## (Intercept)   -1.231 0.161 -7.66 1.89e-14
## forest         0.643 0.113  5.70 1.21e-08
## 
## Detection:
##             Estimate     SE     z  P(>|z|)
## (Intercept)    -3.17 0.1573 -20.2 2.20e-90
## effort          1.12 0.0648  17.2 1.67e-66
## 
## AIC: 1946.236

Now we’d like to plot the occupancy probability estimates as a function of forest cover, and detection probability estimates as a function of effort.

First we create a grid of values for both covariates:

effort_grid <- seq(min(site.covs$effort), 
             max(site.covs$effort), 
             length = 50)
forest_grid <- seq(min(site.covs$forest), 
             max(site.covs$forest), 
             length = 50)

Then we predict detection and occupancy probabilities using the parameter estimates:

det_pred <- plogis(coef(fm1)[3] + coef(fm1)[4] * effort_grid)
occ_pred <- plogis(coef(fm1)[1] + coef(fm1)[2] * forest_grid)

Note that you can obtain the same result with function predict. E.g. let’s consider detection:

effort <- effort_grid
predict(fm1, type = "det", as.data.frame(effort))

For occupancy, you would use:

forest <- forest_grid
predict(fm1, type = "state", as.data.frame(forest))

Now we can visualise both relationships:

par(mfrow = c(1,2))
plot(forest_grid, 
     occ_pred, 
     type = "l", 
     lwd = 3, 
     xlab = "forest cover", 
     ylab = "estimated occupancy prob")
plot(effort_grid, 
     det_pred, 
     type = "l", 
     lwd = 3, 
     xlab = "effort", 
     ylab = "estimated detection prob")

5.2 Observation-level covariates

We now consider for the observation-level covariates, which can change between visits. Each covariate is its own data.frame with rows as sites and columns as repeat visits.

Let’s create a data.frame for a time covariate which we will use to assess the effect of temporal variation on detection:

time <- matrix(c('visit 1','visit 2','visit 3', 'visit 4'),
              nrow = nrow(wolf@y),
              ncol = ncol(wolf@y), 
              byrow = TRUE)

Inspect:

head(time)

##      [,1]      [,2]      [,3]      [,4]     
## [1,] "visit 1" "visit 2" "visit 3" "visit 4"
## [2,] "visit 1" "visit 2" "visit 3" "visit 4"
## [3,] "visit 1" "visit 2" "visit 3" "visit 4"
## [4,] "visit 1" "visit 2" "visit 3" "visit 4"
## [5,] "visit 1" "visit 2" "visit 3" "visit 4"
## [6,] "visit 1" "visit 2" "visit 3" "visit 4"

We recreate our dataset with this new covariate:

wolf <- unmarkedFrameOccu(y = y, # detection/non-detection data
                          siteCovs = site.covs, # site-level cov (effort and forest)
                          obsCovs = list(time = time)) # obs-level cov (time)

Summarize:

summary(wolf)

## unmarkedFrame Object
## 
## 3211 sites
## Maximum number of observations per site: 4 
## Mean number of observations per site: 4 
## Sites with at least one detection: 171 
## 
## Tabulation of y observations:
##     0     1 
## 12540   304 
## 
## Site-level covariates:
##      effort            forest        
##  Min.   :-0.7836   Min.   :-1.25500  
##  1st Qu.:-0.7026   1st Qu.:-0.92319  
##  Median :-0.4190   Median :-0.09233  
##  Mean   : 0.0000   Mean   : 0.00000  
##  3rd Qu.: 0.3101   3rd Qu.: 0.75053  
##  Max.   : 4.6039   Max.   : 3.02854  
## 
## Observation-level covariates:
##       time     
##  visit 1:3211  
##  visit 2:3211  
##  visit 3:3211  
##  visit 4:3211

Fit the model with an additive effect of effort and time on detection, and the effect of forest cover on occupancy:

fm2 <- occu(~ effort + time ~ forest, wolf)

Inspect results:

fm2

## 
## Call:
## occu(formula = ~effort + time ~ forest, data = wolf)
## 
## Occupancy:
##             Estimate    SE     z  P(>|z|)
## (Intercept)    -1.24 0.160 -7.72 1.13e-14
## forest          0.64 0.112  5.70 1.23e-08
## 
## Detection:
##             Estimate     SE      z  P(>|z|)
## (Intercept)   -3.648 0.2161 -16.88 6.39e-64
## effort         1.128 0.0654  17.24 1.45e-66
## timevisit 2    0.781 0.2075   3.76 1.68e-04
## timevisit 3    0.585 0.2109   2.77 5.55e-03
## timevisit 4    0.429 0.2142   2.01 4.49e-02
## 
## AIC: 1936.647

The parameter estimates of the relationship between occupancy probability and forest cover are similar to those obtained with a model without the time effect on detection.

Now let’s plot the effect of effort and time on detection. To do so, we may represent detection as a function of time for an average effort (which is 0 here because effort is standardised), or detection as a function of effort for the 4 visits.

Let us focus on detection as a function of effort for the 4 visits. First we create a grid of values for effort, and consider all combinations of values of effort and visits (1, 2, 3 or 4):

effort_grid <- seq(min(site.covs$effort), 
             max(site.covs$effort), 
             length = 50)
time <- c('visit 1', 'visit 2', 'visit 3', 'visit 4')
grid <- expand_grid(effort = effort_grid, time = time)

Get estimates of detection probability:

det_pred <- predict(fm2, type = "det", as.data.frame(grid))

Get vector or predicted values:

pred <- det_pred$Predicted

Visualize:

plot(effort_grid, 
     pred[grid$time == "visit 1"],
     type = "l", 
     lwd = 3, 
     xlab = "effort", 
     ylab = "estimated probability of detection",
     ylim = c(0,1))
lines(effort_grid, 
      pred[grid$time == "visit 2"],
      lwd = 3,
      col = "blue")
lines(effort_grid, 
      pred[grid$time == "visit 3"],
      lwd = 3,
      col = "pink")
lines(effort_grid, 
      pred[grid$time == "visit 4"],
      lwd = 3,
      col = "green")
legend("topleft", 
       col = c("black", "blue", "pink", "green"), 
       lwd = 3, 
       legend = c("december", "january", "february", "march"))

6 Model selection

We fitted 3 models, and we’d like to ask the question, which one is the best supported by the data. To do so, we use the AIC.

First we re-run our 3 models using the same dataset (unmarked will throw us an error message otherwise):

fm <- occu(~ 1 ~ 1, wolf)
fm1 <- occu(~ effort ~ forest, wolf)
fm2 <- occu(~ effort + time ~ forest, wolf)

Second, we gather our 3 models in a list:

fmList <- fitList('{psi, p}' = fm, 
                  '{psi(forest), p(effort)}' = fm1, 
                  '{psi(forest), p(effort + time)}' = fm2)

Last, we get a table with AIC values for each model:

modSel(fmList)

##                                 nPars     AIC  delta   AICwt cumltvWt
## {psi(forest), p(effort + time)}     7 1936.65   0.00 9.9e-01     0.99
## {psi(forest), p(effort)}            4 1946.24   9.59 8.2e-03     1.00
## {psi, p}                            2 2236.61 299.96 7.3e-66     1.00

Practical 1: Static model

Olivier Gimenez

May 2023

1 Objectives

2 Pre-requisites

3 Data

3.1 Read in data

3.2 Format data

4 Model with constant parameters

5 Deal with covariates

5.1 Site-level covariates

5.2 Observation-level covariates

6 Model selection