Load some packages:
library(tidyverse) # manipulate and visualize data
theme_set(theme_light(base_size = 16)) # set theme for visualisation and font size
library(unmarked) # occupancy modelling
We use data from Mapping and explaining wolf recolonization in France using dynamic occupancy models and opportunistic data by Louvrier et al. in Ecography available at https://onlinelibrary.wiley.com/doi/abs/10.1111/ecog.02874. We use the year 2013 for illustration. Occasions are months December, January, February and March. A 100km\(^2\) grid was overlaid over France and detections and non-detections were collected. See paper for more details.
Read in data:
<- read_csv("data/wolf_occupancy.csv") wolf_data
Inspect data. The column names are self-explanatory. Note that covariates effort (number of observers in a given cell), altitude and forest were standardized.
wolf_data
## # A tibble: 3,211 × 10
## idsite occ1 occ2 occ3 occ4 effort altitude foret X Y
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 8 0 0 0 0 -0.784 -1.03 -1.25 585000 6115000
## 2 9 0 0 0 0 -0.784 1.51 -1.25 595000 6115000
## 3 10 0 0 0 0 -0.784 0.808 -1.25 605000 6115000
## 4 11 0 0 0 0 -0.743 0.675 -1.25 615000 6115000
## 5 12 0 0 0 0 -0.703 0.683 -1.25 625000 6115000
## 6 13 0 0 0 0 -0.703 0.507 -1.25 635000 6115000
## 7 14 0 0 0 0 -0.662 -1.03 -1.25 645000 6115000
## 8 15 0 0 0 0 -0.703 -1.03 -1.25 655000 6115000
## 9 16 0 0 0 0 -0.743 -1.03 -1.25 665000 6115000
## 10 68 0 0 0 0 -0.784 2.02 -1.25 585000 6125000
## # ℹ 3,201 more rows
There are > 3000 cells.
nrow(wolf_data)
## [1] 3211
We pick the detections and non-detections, and format them for use in
unmarked
:
<- wolf_data[,2:5]
y <- unmarkedFrameOccu(y = y) wolf
Inspect:
head(wolf)
## Data frame representation of unmarkedFrame object.
## y.1 y.2 y.3 y.4
## 1 0 0 0 0
## 2 0 0 0 0
## 3 0 0 0 0
## 4 0 0 0 0
## 5 0 0 0 0
## 6 0 0 0 0
## 7 0 0 0 0
## 8 0 0 0 0
## 9 0 0 0 0
## 10 0 0 0 0
Summarise data:
summary(wolf)
## unmarkedFrame Object
##
## 3211 sites
## Maximum number of observations per site: 4
## Mean number of observations per site: 4
## Sites with at least one detection: 171
##
## Tabulation of y observations:
## 0 1
## 12540 304
Visualize:
plot(wolf)
Fit a model with constant parameters, detection first, then occupancy:
<- occu(~ 1 ~ 1, data = wolf) fm
Inspect results:
fm
##
## Call:
## occu(formula = ~1 ~ 1, data = wolf)
##
## Occupancy:
## Estimate SE z P(>|z|)
## -2.71 0.0839 -32.2 6.74e-228
##
## Detection:
## Estimate SE z P(>|z|)
## -0.499 0.0962 -5.19 2.15e-07
##
## AIC: 2236.609
Parameter estimates are on the logit scale. We need to back-transform them on their natural scale, ie between 0 and 1. There are several ways to do just that.
First, inspect model coefficients:
coef(fm)
## psi(Int) p(Int)
## -2.7056882 -0.4986384
Get occupancy on logit scale, then back-transform it:
<- coef(fm)[1]
logit_occupancy <- 1 / (1 + exp(-logit_occupancy))
occupancy occupancy
## psi(Int)
## 0.06263854
Idem for detection, note that we use the inverse logit function native to R:
<- coef(fm)[2]
logit_det <- plogis(logit_det)
detection detection
## p(Int)
## 0.3778607
Alternatively, you may use the unmarked
function
backTransform
:
backTransform(fm, type ='state')
## Backtransformed linear combination(s) of Occupancy estimate(s)
##
## Estimate SE LinComb (Intercept)
## 0.0626 0.00493 -2.71 1
##
## Transformation: logistic
backTransform(fm, type ='det')
## Backtransformed linear combination(s) of Detection estimate(s)
##
## Estimate SE LinComb (Intercept)
## 0.378 0.0226 -0.499 1
##
## Transformation: logistic
And to get confidence intervals:
confint(backTransform(fm, type='state'))
## 0.025 0.975
## 0.05364514 0.07302331
confint(backTransform(fm, type='det'))
## 0.025 0.975
## 0.3346802 0.4230697
Here we consider site-level covariates, which do not change between visits.
Let’s fit another model with detection as a function of effort, and occupancy as a function of forest cover. We need to recreate a dataset that includes these covariates, with one column per covariate:
<- data.frame(effort = wolf_data$effort,
site.covs forest = wolf_data$foret)
<- unmarkedFrameOccu(y = y,
wolf siteCovs = site.covs)
Summarise:
summary(wolf)
## unmarkedFrame Object
##
## 3211 sites
## Maximum number of observations per site: 4
## Mean number of observations per site: 4
## Sites with at least one detection: 171
##
## Tabulation of y observations:
## 0 1
## 12540 304
##
## Site-level covariates:
## effort forest
## Min. :-0.7836 Min. :-1.25500
## 1st Qu.:-0.7026 1st Qu.:-0.92319
## Median :-0.4190 Median :-0.09233
## Mean : 0.0000 Mean : 0.00000
## 3rd Qu.: 0.3101 3rd Qu.: 0.75053
## Max. : 4.6039 Max. : 3.02854
Fit the model:
<- occu(~ effort ~ forest, data = wolf) fm1
Inspect results:
fm1
##
## Call:
## occu(formula = ~effort ~ forest, data = wolf)
##
## Occupancy:
## Estimate SE z P(>|z|)
## (Intercept) -1.231 0.161 -7.66 1.89e-14
## forest 0.643 0.113 5.70 1.21e-08
##
## Detection:
## Estimate SE z P(>|z|)
## (Intercept) -3.17 0.1573 -20.2 2.20e-90
## effort 1.12 0.0648 17.2 1.67e-66
##
## AIC: 1946.236
Now we’d like to plot the occupancy probability estimates as a function of forest cover, and detection probability estimates as a function of effort.
First we create a grid of values for both covariates:
<- seq(min(site.covs$effort),
effort_grid max(site.covs$effort),
length = 50)
<- seq(min(site.covs$forest),
forest_grid max(site.covs$forest),
length = 50)
Then we predict detection and occupancy probabilities using the parameter estimates:
<- plogis(coef(fm1)[3] + coef(fm1)[4] * effort_grid)
det_pred <- plogis(coef(fm1)[1] + coef(fm1)[2] * forest_grid) occ_pred
Note that you can obtain the same result with function
predict
. E.g. let’s consider detection:
<- effort_grid
effort predict(fm1, type = "det", as.data.frame(effort))
For occupancy, you would use:
<- forest_grid
forest predict(fm1, type = "state", as.data.frame(forest))
Now we can visualise both relationships:
par(mfrow = c(1,2))
plot(forest_grid,
occ_pred, type = "l",
lwd = 3,
xlab = "forest cover",
ylab = "estimated occupancy prob")
plot(effort_grid,
det_pred, type = "l",
lwd = 3,
xlab = "effort",
ylab = "estimated detection prob")
We now consider for the observation-level covariates, which can change between visits. Each covariate is its own data.frame with rows as sites and columns as repeat visits.
Let’s create a data.frame for a time covariate which we will use to assess the effect of temporal variation on detection:
<- matrix(c('visit 1','visit 2','visit 3', 'visit 4'),
time nrow = nrow(wolf@y),
ncol = ncol(wolf@y),
byrow = TRUE)
Inspect:
head(time)
## [,1] [,2] [,3] [,4]
## [1,] "visit 1" "visit 2" "visit 3" "visit 4"
## [2,] "visit 1" "visit 2" "visit 3" "visit 4"
## [3,] "visit 1" "visit 2" "visit 3" "visit 4"
## [4,] "visit 1" "visit 2" "visit 3" "visit 4"
## [5,] "visit 1" "visit 2" "visit 3" "visit 4"
## [6,] "visit 1" "visit 2" "visit 3" "visit 4"
We recreate our dataset with this new covariate:
<- unmarkedFrameOccu(y = y, # detection/non-detection data
wolf siteCovs = site.covs, # site-level cov (effort and forest)
obsCovs = list(time = time)) # obs-level cov (time)
Summarize:
summary(wolf)
## unmarkedFrame Object
##
## 3211 sites
## Maximum number of observations per site: 4
## Mean number of observations per site: 4
## Sites with at least one detection: 171
##
## Tabulation of y observations:
## 0 1
## 12540 304
##
## Site-level covariates:
## effort forest
## Min. :-0.7836 Min. :-1.25500
## 1st Qu.:-0.7026 1st Qu.:-0.92319
## Median :-0.4190 Median :-0.09233
## Mean : 0.0000 Mean : 0.00000
## 3rd Qu.: 0.3101 3rd Qu.: 0.75053
## Max. : 4.6039 Max. : 3.02854
##
## Observation-level covariates:
## time
## visit 1:3211
## visit 2:3211
## visit 3:3211
## visit 4:3211
Fit the model with an additive effect of effort and time on detection, and the effect of forest cover on occupancy:
<- occu(~ effort + time ~ forest, wolf) fm2
Inspect results:
fm2
##
## Call:
## occu(formula = ~effort + time ~ forest, data = wolf)
##
## Occupancy:
## Estimate SE z P(>|z|)
## (Intercept) -1.24 0.160 -7.72 1.13e-14
## forest 0.64 0.112 5.70 1.23e-08
##
## Detection:
## Estimate SE z P(>|z|)
## (Intercept) -3.648 0.2161 -16.88 6.39e-64
## effort 1.128 0.0654 17.24 1.45e-66
## timevisit 2 0.781 0.2075 3.76 1.68e-04
## timevisit 3 0.585 0.2109 2.77 5.55e-03
## timevisit 4 0.429 0.2142 2.01 4.49e-02
##
## AIC: 1936.647
The parameter estimates of the relationship between occupancy probability and forest cover are similar to those obtained with a model without the time effect on detection.
Now let’s plot the effect of effort and time on detection. To do so, we may represent detection as a function of time for an average effort (which is 0 here because effort is standardised), or detection as a function of effort for the 4 visits.
Let us focus on detection as a function of effort for the 4 visits. First we create a grid of values for effort, and consider all combinations of values of effort and visits (1, 2, 3 or 4):
<- seq(min(site.covs$effort),
effort_grid max(site.covs$effort),
length = 50)
<- c('visit 1', 'visit 2', 'visit 3', 'visit 4')
time <- expand_grid(effort = effort_grid, time = time) grid
Get estimates of detection probability:
<- predict(fm2, type = "det", as.data.frame(grid)) det_pred
Get vector or predicted values:
<- det_pred$Predicted pred
Visualize:
plot(effort_grid,
$time == "visit 1"],
pred[gridtype = "l",
lwd = 3,
xlab = "effort",
ylab = "estimated probability of detection",
ylim = c(0,1))
lines(effort_grid,
$time == "visit 2"],
pred[gridlwd = 3,
col = "blue")
lines(effort_grid,
$time == "visit 3"],
pred[gridlwd = 3,
col = "pink")
lines(effort_grid,
$time == "visit 4"],
pred[gridlwd = 3,
col = "green")
legend("topleft",
col = c("black", "blue", "pink", "green"),
lwd = 3,
legend = c("december", "january", "february", "march"))
We fitted 3 models, and we’d like to ask the question, which one is the best supported by the data. To do so, we use the AIC.
First we re-run our 3 models using the same dataset
(unmarked
will throw us an error message otherwise):
<- occu(~ 1 ~ 1, wolf)
fm <- occu(~ effort ~ forest, wolf)
fm1 <- occu(~ effort + time ~ forest, wolf) fm2
Second, we gather our 3 models in a list:
<- fitList('{psi, p}' = fm,
fmList '{psi(forest), p(effort)}' = fm1,
'{psi(forest), p(effort + time)}' = fm2)
Last, we get a table with AIC values for each model:
modSel(fmList)
## nPars AIC delta AICwt cumltvWt
## {psi(forest), p(effort + time)} 7 1936.65 0.00 9.9e-01 0.99
## {psi(forest), p(effort)} 4 1946.24 9.59 8.2e-03 1.00
## {psi, p} 2 2236.61 299.96 7.3e-66 1.00