Class 8 live demo: Skip your coffee break: Speed up MCMC convergence

Introduction

In this demo, we illustrate how nimble can be used to speed up MCMC computations. First, we demonstrate how to change the default samplers. Then we use the nimbleEcology package which implements marginalization (by integrating or sum the likelihood over latent states). Marginalization eliminates the need for MCMC sampling these latent variables, which often provide efficiency gains. We also illustrate how to use nimble functions to write a likelihood that works with a dataset in which we pool individuals with the same encounter histories.

library(tidyverse)
library(nimbleEcology)

Loading nimbleEcology. Registering the following distributions:
 dOcc, dDynOcc, dCJS, dHMM, dDHMM, dNmixture.

library(MCMCvis)

Work with samplers: Use slice sampling

We’re going back to the Dipper example. Let’s consider a model with wing length and individual random effect on survival.

Read in data.

dipper <- read_csv(here::here("slides", "dat", "dipper.csv"))
y <- dipper %>%
  select(year_1981:year_1987) %>%
  as.matrix()

Get occasion of first capture.

first <- apply(y, 1, function(x) min(which(x !=0)))

Constants in a list.

wing.length.st <- as.vector(scale(dipper$wing_length))
my.constants <- list(N = nrow(y), 
                     T = ncol(y), 
                     first = first,
                     winglength = wing.length.st)

Data in a list.

my.data <- list(y = y + 1)

Initial values.

zinits <- y + 1 # non-detection -> alive
zinits[zinits == 2] <- 1 # dead -> alive
initial.values <- function() list(beta = rnorm(2,0,1.5),
                                  sdeps = runif(1,0,3),
                                  p = runif(1,0,1),
                                  z = zinits)

MCMC details.

parameters.to.save <- c("beta", "sdeps", "p")
n.iter <- 10000
n.burnin <- 2500
n.chains <- 2

Write the model.

hmm.phiwlrep <- nimbleCode({
    p ~ dunif(0, 1) # prior detection
    omega[1,1] <- 1 - p    # Pr(alive t -> non-detected t)
    omega[1,2] <- p        # Pr(alive t -> detected t)
    omega[2,1] <- 1        # Pr(dead t -> non-detected t)
    omega[2,2] <- 0        # Pr(dead t -> detected t)
  for (i in 1:N){
    logit(phi[i]) <- beta[1] + beta[2] * winglength[i] + eps[i]
    eps[i] ~ dnorm(mean = 0, sd = sdeps)
    gamma[1,1,i] <- phi[i]      # Pr(alive t -> alive t+1)
    gamma[1,2,i] <- 1 - phi[i]  # Pr(alive t -> dead t+1)
    gamma[2,1,i] <- 0           # Pr(dead t -> alive t+1)
    gamma[2,2,i] <- 1           # Pr(dead t -> dead t+1)
  }
  beta[1] ~ dnorm(mean = 0, sd = 1.5)
  beta[2] ~ dnorm(mean = 0, sd = 1.5)
  sdeps ~ dunif(0, 10)
  delta[1] <- 1          # Pr(alive t = 1) = 1
  delta[2] <- 0          # Pr(dead t = 1) = 0
  # likelihood
  for (i in 1:N){
    z[i,first[i]] ~ dcat(delta[1:2])
    for (j in (first[i]+1):T){
      z[i,j] ~ dcat(gamma[z[i,j-1], 1:2, i])
      y[i,j] ~ dcat(omega[z[i,j], 1:2])
    }
  }
})

Run nimble.

mcmc.phiwlrep <- nimbleMCMC(code = hmm.phiwlrep, 
                            constants = my.constants,
                            data = my.data,              
                            inits = initial.values,
                            monitors = parameters.to.save,
                            niter = n.iter,
                            nburnin = n.burnin, 
                            nchains = n.chains)

defining model...

building model...

setting data and initial values...

running calculate on model (any error reports that follow may simply reflect missing values in model variables) ... 
checking model sizes and dimensions... This model is not fully initialized. This is not an error. To see which variables are not initialized, use model$initializeInfo(). For more information on model initialization, see help(modelInitialization).
checking model calculations...

NAs were detected in model variables: eps, logProb_eps, phi, gamma, logProb_z.

model building finished.
compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details.
compilation finished.
running chain 1...

|-------------|-------------|-------------|-------------|
|-------------------------------------------------------|

running chain 2...

|-------------|-------------|-------------|-------------|
|-------------------------------------------------------|

Let’s have a look to the trace of the standard deviation of the random effect. Hmmm.

MCMCtrace(mcmc.phiwlrep, params = "sdeps", pdf = FALSE)

Let’s try and change the default sampler for this parameter. What are the samplers used by default?

hmm.phiwlrep <- nimbleModel(code = hmm.phiwlrep,
                            constants = my.constants,
                            data = my.data,
                            inits = initial.values())
mcmcConf <- configureMCMC(hmm.phiwlrep)

===== Monitors =====
thin = 1: p, beta, sdeps, z
===== Samplers =====
RW sampler (259)
  - p
  - beta[]  (2 elements)
  - sdeps
  - eps[]  (255 elements)
posterior_predictive sampler (78)
  - eps[]  (39 elements)
  - z[]  (39 elements)
categorical sampler (1103)
  - z[]  (1103 elements)

We remove the default sampler, and use slice sampler instead.

mcmcConf$removeSamplers('sdeps')
mcmcConf$addSampler(target = 'sdeps',
                    type = "slice")
mcmcConf

===== Monitors =====
thin = 1: p, beta, sdeps, z
===== Samplers =====
slice sampler (1)
  - sdeps
RW sampler (258)
  - p
  - beta[]  (2 elements)
  - eps[]  (255 elements)
posterior_predictive sampler (78)
  - eps[]  (39 elements)
  - z[]  (39 elements)
categorical sampler (1103)
  - z[]  (1103 elements)

Compile model and MCMC.

Rmcmc <- buildMCMC(mcmcConf)
Cmodel <- compileNimble(hmm.phiwlrep)

compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details.

compilation finished.

Cmcmc <- compileNimble(Rmcmc, project = hmm.phiwlrep)

compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details.
compilation finished.

Run nimble.

Cmcmc$run(10000)

|-------------|-------------|-------------|-------------|
|-------------------------------------------------------|

NULL

samples1 <- as.matrix(Cmcmc$mvSamples)
Cmcmc$run(10000)

|-------------|-------------|-------------|-------------|
|-------------------------------------------------------|

NULL

samples2 <- as.matrix(Cmcmc$mvSamples)

Format results in data.frames.

df.sdeps <- data.frame(iter = c(2501:10000, 2501:10000),
                 samples = c(samples1[2501:10000,"sdeps"], samples2[2501:10000,"sdeps"]), 
                 chain = c(rep("chain 1", length(samples1[2501:10000,"sdeps"])),
                           rep("chain 2", length(samples2[2501:10000,"sdeps"]))))
df.beta <- data.frame(iter = c(2501:10000, 2501:10000),
                 beta1 = c(samples1[2501:10000,"beta[1]"], samples2[2501:10000,"beta[1]"]), 
                 beta2 = c(samples1[2501:10000,"beta[2]"], samples2[2501:10000,"beta[2]"]), 
                 chain = c(rep("chain 1", length(samples1[2501:10000,"sdeps"])),
                           rep("chain 2", length(samples2[2501:10000,"sdeps"]))))

Trace plot for the standard deviation of the random effect.

Work with samplers: Use block sampling

High correlation in (regression) parameters may make independent samplers inefficient. Let’s have a look to the correlation between the intercept and the slope of the relationship between survival and wing length in the European dipper.

There seems to be no correlation. Let’s act for a minute as if we had a strong correlation. We’re gonna see how block sampling might help. In block sampling, we propose candidate values from a multivariate distribution.

First, we remove and replace independent RW samples by block sampling. Then we proceed as usual.

mcmcConf$removeSamplers(c('beta[1]','beta[2]'))
mcmcConf$addSampler(target = c('beta[1]','beta[2]'),
                    type = "RW_block")

Note: Assigning an RW_block sampler to nodes with very different scales can result in low MCMC efficiency.  If all nodes assigned to RW_block are not on a similar scale, we recommend providing an informed value for the "propCov" control list argument, or using the AFSS sampler instead.

mcmcConf

===== Monitors =====
thin = 1: p, beta, sdeps, z
===== Samplers =====
slice sampler (1)
  - sdeps
RW_block sampler (1)
  - beta[1], beta[2] 
RW sampler (256)
  - p
  - eps[]  (255 elements)
posterior_predictive sampler (78)
  - eps[]  (39 elements)
  - z[]  (39 elements)
categorical sampler (1103)
  - z[]  (1103 elements)

Rmcmc <- buildMCMC(mcmcConf)
Cmodel <- compileNimble(hmm.phiwlrep)

compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details.

compilation finished.

Cmcmc <- compileNimble(Rmcmc, project = hmm.phiwlrep)

compiling... this may take a minute. Use 'showCompilerOutput = TRUE' to see C++ compilation details.
compilation finished.

Run nimble and a single chain.

Cmcmc$run(10000)

|-------------|-------------|-------------|-------------|
|-------------------------------------------------------|

NULL

samples <- as.matrix(Cmcmc$mvSamples)

Summarize.

samples %>%
  as_tibble() %>%
  select(!starts_with("z")) %>% # ignore the latent states z
  summarise(across(everything(), list(mean = mean, sd = sd)))

# A tibble: 1 x 8
  `beta[1]_mean` `beta[1]_sd` `beta[2]_mean` `beta[2]_sd` p_mean   p_sd
           <dbl>        <dbl>          <dbl>        <dbl>  <dbl>  <dbl>
1          0.204        0.132       -0.00842        0.104  0.896 0.0370
# … with 2 more variables: sdeps_mean <dbl>, sdeps_sd <dbl>

Marginalization with nimbleEcology

Let’s get back to the analysis of the Canada geese data, with 3 sites.

geese <- read_csv("geese.csv", col_names = TRUE)


── Column specification ────────────────────────────────────────────────────────
cols(
  year_1984 = col_double(),
  year_1985 = col_double(),
  year_1986 = col_double(),
  year_1987 = col_double(),
  year_1988 = col_double(),
  year_1989 = col_double()
)

y <- as.matrix(geese)

Get the occasion of first capture for each individual.

get.first <- function(x) min(which(x != 0))
first <- apply(y, 1, get.first)

We filter out individuals that are first captured at last occasion. These individuals do not contribute to parameter estimation, and also they cause problems with nimbleEcology.

mask <- which(first!=ncol(y)) # individuals that are not first encountered at last occasion
y <- y[mask, ]                # keep only these
first <- first[mask]

Let’s write the model. Note that the likelihood is simpler due to the use of the function dHMM.

multisite.marginalized <- nimbleCode({
  
  # -------------------------------------------------
  # Parameters:
  # phiA: survival probability site A
  # phiB: survival probability site B
  # phiC: survival probability site B
  # psiAA = psiA[1]: movement probability from site A to site A (reference)
  # psiAB = psiA[2]: movement probability from site A to site B
  # psiAC = psiA[3]: movement probability from site A to site C 
  # psiBA = psiB[1]: movement probability from site B to site A
  # psiBB = psiB[2]: movement probability from site B to site B (reference)
  # psiBC = psiB[3]: movement probability from site B to site C
  # psiCA = psiC[1]: movement probability from site C to site A
  # psiCB = psiC[2]: movement probability from site C to site B
  # psiCC = psiC[3]: movement probability from site C to site C (reference)
  # pA: recapture probability site A
  # pB: recapture probability site B
  # pC: recapture probability site C
  # -------------------------------------------------
  # States (z):
  # 1 alive at A
  # 2 alive at B
  # 2 alive at C
  # 3 dead
  # Observations (y):  
  # 1 not seen
  # 2 seen at A 
  # 3 seen at B
  # 3 seen at C
  # -------------------------------------------------
  
  # survival priors
  phiA ~ dunif(0, 1)
  phiB ~ dunif(0, 1)
  phiC ~ dunif(0, 1)
  # priors for detection
  pA ~ dunif(0, 1)
  pB ~ dunif(0, 1)
  pC ~ dunif(0, 1)
  # priors for transitions: Dirichlet
  psiA[1:3] ~ ddirch(alpha[1:3])
  psiB[1:3] ~ ddirch(alpha[1:3])
  psiC[1:3] ~ ddirch(alpha[1:3])
  # probabilities of state z(t+1) given z(t)
  gamma[1,1] <- phiA * psiA[1]
  gamma[1,2] <- phiA * psiA[2]
  gamma[1,3] <- phiA * psiA[3]
  gamma[1,4] <- 1 - phiA
  gamma[2,1] <- phiB * psiB[1]
  gamma[2,2] <- phiB * psiB[2]
  gamma[2,3] <- phiB * psiB[3]
  gamma[2,4] <- 1 - phiB
  gamma[3,1] <- phiC * psiC[1]
  gamma[3,2] <- phiC * psiC[2]
  gamma[3,3] <- phiC * psiC[3]
  gamma[3,4] <- 1 - phiC
  gamma[4,1] <- 0
  gamma[4,2] <- 0
  gamma[4,3] <- 0
  gamma[4,4] <- 1
  
  # probabilities of y(t) given z(t)
  omega[1,1] <- 1 - pA     # Pr(alive A t -> non-detected t)
  omega[1,2] <- pA         # Pr(alive A t -> detected A t)
  omega[1,3] <- 0          # Pr(alive A t -> detected B t)
  omega[1,4] <- 0          # Pr(alive A t -> detected C t)
  omega[2,1] <- 1 - pB     # Pr(alive B t -> non-detected t)
  omega[2,2] <- 0          # Pr(alive B t -> detected A t)
  omega[2,3] <- pB         # Pr(alive B t -> detected B t)
  omega[2,4] <- 0          # Pr(alive B t -> detected C t)
  omega[3,1] <- 1 - pC     # Pr(alive C t -> non-detected t)
  omega[3,2] <- 0          # Pr(alive C t -> detected A t)
  omega[3,3] <- 0          # Pr(alive C t -> detected B t)
  omega[3,4] <- pC         # Pr(alive C t -> detected C t)
  omega[4,1] <- 1          # Pr(dead t -> non-detected t)
  omega[4,2] <- 0          # Pr(dead t -> detected A t)
  omega[4,3] <- 0          # Pr(dead t -> detected B t)
  omega[4,4] <- 0          # Pr(dead t -> detected C t)
  # initial state probs
  for(i in 1:N) {
    init[i, 1:4] <- gamma[ y[i, first[i] ] - 1, 1:4 ] # First state propagation
  }
    
  # likelihood 
  for (i in 1:N){
      y[i,(first[i]+1):K] ~ dHMM(init = init[i,1:4],  # count data from first[i] + 1
                                 probObs = omega[1:4,1:4],     # observation matrix
                                 probTrans = gamma[1:4,1:4],   # transition matrix
                                 len = K - first[i],           # nb of occasions
                                 checkRowSums = 0)             # do not check whether elements in a row sum tp 1
      }
})

Data in a list.

my.data <- list(y = y + 1, 
                alpha = c(1, 1, 1))

Constants in a list.

my.constants <- list(first = first, 
                     K = ncol(y), 
                     N = nrow(y))

Initial values. Note that we do not need initial values for the latent states anymore.

initial.values <- function(){list(phiA = runif(1, 0, 1), 
                                  phiB = runif(1, 0, 1), 
                                  phiC = runif(1, 0, 1), 
                                  psiA = rdirch(1, c(1,1,1)),
                                  psiB = rdirch(1, c(1,1,1)),
                                  psiC = rdirch(1, c(1,1,1)),
                                  pA = runif(1, 0, 1), 
                                  pB = runif(1, 0, 1), 
                                  pC = runif(1, 0, 1))}

Parameters to monitor.

parameters.to.save <- c("phiA", "phiB", "phiC", "psiA", "psiB", "psiC","pA", "pB", "pC")

MCMC settings.

n.iter <- 10000
n.burnin <- 5000
n.chains <- 2

Run nimble.

system.time(multisite.marginalized.out <- nimbleMCMC(code = multisite.marginalized, 
                                                     constants = my.constants,
                                                     data = my.data,              
                                                     inits = initial.values(),
                                                     monitors = parameters.to.save,
                                                     niter = n.iter,
                                                     nburnin = n.burnin, 
                                                     nchains = n.chains))

The run took around 2 minutes. For comparison, the standard formulation (see live demo for class 6 on transition estimation) took around 4 minutes.

Explore outputs.

MCMCsummary(multisite.marginalized.out, round = 2)

        mean   sd 2.5%  50% 97.5% Rhat n.eff
pA      0.51 0.08 0.36 0.51  0.69 1.00   771
pB      0.46 0.05 0.37 0.45  0.56 1.00   741
pC      0.24 0.06 0.14 0.23  0.38 1.00   542
phiA    0.61 0.05 0.51 0.61  0.71 1.00  1304
phiB    0.70 0.04 0.63 0.70  0.77 1.00  1002
phiC    0.77 0.07 0.64 0.77  0.91 1.00   865
psiA[1] 0.74 0.05 0.63 0.74  0.84 1.00  1564
psiA[2] 0.24 0.05 0.14 0.23  0.35 1.00  1460
psiA[3] 0.02 0.02 0.00 0.02  0.08 1.00  2372
psiB[1] 0.07 0.02 0.04 0.07  0.12 1.00  1250
psiB[2] 0.83 0.04 0.73 0.84  0.90 1.01   849
psiB[3] 0.10 0.04 0.04 0.09  0.19 1.02   890
psiC[1] 0.02 0.02 0.00 0.02  0.06 1.01  2321
psiC[2] 0.21 0.05 0.12 0.21  0.33 1.00  1495
psiC[3] 0.77 0.06 0.64 0.77  0.86 1.00  1359

Marginalization and weighted likelihood with nimbleEcology

In this section, we’re gonna use nimble functions to express the likelihood using pooled encounter histories. We use a vector mult that contains the number of individuals with a particular encounter history. We hacked the dHMM nimbleEcology function below.

dHMMweighted <- nimbleFunction(
  run = function (x = double(1), 
                  init = double(1), 
                  probObs = double(2),
                  probTrans = double(2), 
                  len = double(0),
                  mult = double(0), # NEWLY ADDED: argument stating number of occurrences 
                                    # of same encounter history in entire dataset 
                  checkRowSums = integer(0, default = 0),  
                  log = integer(0, default = 0)) 
  {
    if (length(x) != len) 
      nimStop("In dHMM: Argument len must be length of x or 0.")
    if (nimDim(probObs)[1] != nimDim(probTrans)[1]) 
      nimStop("In dHMM: Length of dimension 1 in probObs must equal length of dimension 1 in probTrans.")
    if (nimDim(probTrans)[1] != nimDim(probTrans)[2]) 
      nimStop("In dHMM: probTrans must be a square matrix.")
    ## There was a strict test for sum(init) == 1.  This could be true in R and false in C++!
    if (abs(sum(init) - 1) > 1e-06) 
      nimStop("In dHMM: Initial probabilities must sum to 1.")
    if (checkRowSums) {
      transCheckPasses <- TRUE
      for (i in 1:nimDim(probTrans)[1]) {
        thisCheckSum <- sum(probTrans[i, ])
        if (abs(thisCheckSum - 1) > 1e-06) {
          nimPrint("In dHMM: Problem with sum(probTrans[i,]) with i = ", 
                   i, ". The sum should be 1 but is ", thisCheckSum)
          transCheckPasses <- FALSE
        }
      }
      obsCheckPasses <- TRUE
      for (i in 1:nimDim(probObs)[1]) {
        thisCheckSum <- sum(probObs[i, ])
        if (abs(thisCheckSum - 1) > 1e-06) {
          nimPrint("In dHMM: Problem with sum(probObs[i,]) with i = ", 
                   i, ". The sum should be 1 but is ", thisCheckSum)
          obsCheckPasses <- FALSE
        }
      }
      if (!(transCheckPasses | obsCheckPasses)) 
        nimStop("In dHMM: probTrans and probObs were not specified correctly.  Probabilities in each row (second dimension) must sum to 1.")
      if (!transCheckPasses) 
        nimStop("In dHMM: probTrans was not specified correctly.  Probabilities in each row (second dimension) must sum to 1.")
      if (!obsCheckPasses) 
        nimStop("In dHMM: probObs was not specified correctly. Probabilities in each row must sum to 1.")
    }
    pi <- init
    logL <- 0
    nObsClasses <- nimDim(probObs)[2]
    for (t in 1:len) {
      if (x[t] > nObsClasses | x[t] < 1) 
        nimStop("In dHMM: Invalid value of x[t].")
      Zpi <- probObs[, x[t]] * pi
      sumZpi <- sum(Zpi)
      logL <- logL + log(sumZpi) * mult # NEWLY ADDED
      if (t != len) 
        pi <- ((Zpi %*% probTrans)/sumZpi)[1, ]
    }
    if (log) 
      return(logL)
    return(exp(logL))
    returnType(double())
  })

Write the model.

multisite.marginalized <- nimbleCode({
  
  # -------------------------------------------------
  # Parameters:
  # phiA: survival probability site A
  # phiB: survival probability site B
  # phiC: survival probability site B
  # psiAA = psiA[1]: movement probability from site A to site A (reference)
  # psiAB = psiA[2]: movement probability from site A to site B
  # psiAC = psiA[3]: movement probability from site A to site C 
  # psiBA = psiB[1]: movement probability from site B to site A
  # psiBB = psiB[2]: movement probability from site B to site B (reference)
  # psiBC = psiB[3]: movement probability from site B to site C
  # psiCA = psiC[1]: movement probability from site C to site A
  # psiCB = psiC[2]: movement probability from site C to site B
  # psiCC = psiC[3]: movement probability from site C to site C (reference)
  # pA: recapture probability site A
  # pB: recapture probability site B
  # pC: recapture probability site C
  # -------------------------------------------------
  # States (z):
  # 1 alive at A
  # 2 alive at B
  # 2 alive at C
  # 3 dead
  # Observations (y):  
  # 1 not seen
  # 2 seen at A 
  # 3 seen at B
  # 3 seen at C
  # -------------------------------------------------
  
  # survival priors
  phiA ~ dunif(0, 1)
  phiB ~ dunif(0, 1)
  phiC ~ dunif(0, 1)
  # detection priors
  pA ~ dunif(0, 1)
  pB ~ dunif(0, 1)
  pC ~ dunif(0, 1)
  # transition priors: Dirichlet
  psiA[1:3] ~ ddirch(alpha[1:3])
  psiB[1:3] ~ ddirch(alpha[1:3])
  psiC[1:3] ~ ddirch(alpha[1:3])
  gamma[1,1] <- phiA * psiA[1]
  gamma[1,2] <- phiA * psiA[2]
  gamma[1,3] <- phiA * psiA[3]
  gamma[1,4] <- 1 - phiA
  gamma[2,1] <- phiB * psiB[1]
  gamma[2,2] <- phiB * psiB[2]
  gamma[2,3] <- phiB * psiB[3]
  gamma[2,4] <- 1 - phiB
  gamma[3,1] <- phiC * psiC[1]
  gamma[3,2] <- phiC * psiC[2]
  gamma[3,3] <- phiC * psiC[3]
  gamma[3,4] <- 1 - phiC
  gamma[4,1] <- 0
  gamma[4,2] <- 0
  gamma[4,3] <- 0
  gamma[4,4] <- 1
  
  # probabilities of y(t) given z(t)
  omega[1,1] <- 1 - pA     # Pr(alive A t -> non-detected t)
  omega[1,2] <- pA         # Pr(alive A t -> detected A t)
  omega[1,3] <- 0          # Pr(alive A t -> detected B t)
  omega[1,4] <- 0          # Pr(alive A t -> detected C t)
  omega[2,1] <- 1 - pB     # Pr(alive B t -> non-detected t)
  omega[2,2] <- 0          # Pr(alive B t -> detected A t)
  omega[2,3] <- pB         # Pr(alive B t -> detected B t)
  omega[2,4] <- 0          # Pr(alive B t -> detected C t)
  omega[3,1] <- 1 - pC     # Pr(alive C t -> non-detected t)
  omega[3,2] <- 0          # Pr(alive C t -> detected A t)
  omega[3,3] <- 0          # Pr(alive C t -> detected B t)
  omega[3,4] <- pC         # Pr(alive C t -> detected C t)
  omega[4,1] <- 1          # Pr(dead t -> non-detected t)
  omega[4,2] <- 0          # Pr(dead t -> detected A t)
  omega[4,3] <- 0          # Pr(dead t -> detected B t)
  omega[4,4] <- 0          # Pr(dead t -> detected C t)
  
  for(i in 1:N) {
    init[i, 1:4] <- gamma[ y[i, first[i] ] - 1, 1:4 ] # First state propagation
    }
  
  # likelihood 
  for (i in 1:N){
    y[i,(first[i]+1):K] ~ dHMMweighted(init = init[i,1:4], # count data from first[i] + 1
                                       mult = mult[i],
                                       probObs = omega[1:4,1:4],
                                       probTrans = gamma[1:4,1:4],
                                       len = K - first[i],
                                       checkRowSums = 0)
  }
})

We need to pool the individual encounter histories by unique encounter histories, and to record the number of individuals with a particular unique encounter history.

geese <- read_csv("geese.csv", col_names = TRUE)


── Column specification ────────────────────────────────────────────────────────
cols(
  year_1984 = col_double(),
  year_1985 = col_double(),
  year_1986 = col_double(),
  year_1987 = col_double(),
  year_1988 = col_double(),
  year_1989 = col_double()
)

y <- as.matrix(geese)
y_weighted <- y %>% 
  as_tibble() %>% 
  group_by_all() %>% 
  summarise(mult = n()) %>% 
  relocate(mult) %>% 
  as.matrix()

`summarise()` has grouped output by 'year_1984', 'year_1985', 'year_1986', 'year_1987', 'year_1988'. You can override using the `.groups` argument.

head(y_weighted)

     mult year_1984 year_1985 year_1986 year_1987 year_1988 year_1989
[1,]    5         0         0         0         0         0         1
[2,]    8         0         0         0         0         0         2
[3,]    1         0         0         0         0         0         3
[4,]    6         0         0         0         0         1         0
[5,]    2         0         0         0         0         1         1
[6,]   27         0         0         0         0         2         0

mult <- y_weighted[,1] # nb of individuals w/ a particular encounter history
y <- y_weighted[,-1] # pooled data

There are 5 individuals that were detected only once, in site A in year 1989, 8 individuals that were detected only once, in 1989 in site B, and so on.

Get the occasion of first capture for each history.

get.first <- function(x) min(which(x != 0))
first <- apply(y, 1, get.first)

Filter out individuals that are first captured at last occasion.

mask <- which(first!=ncol(y))
y <- y[mask, ]

Apply filter on occasion of first capture and sample size.

first <- first[mask]
mult <- mult[mask]

Data in a list.

my.data <- list(y = y + 1, 
                alpha = c(1, 1, 1))

Constants in a list.

my.constants <- list(first = first, 
                     K = ncol(y), 
                     N = nrow(y),
                     mult = mult)

Initial values. Note that we do not need initial values for the latent states anymore.

initial.values <- function(){list(phiA = runif(1, 0, 1), 
                                  phiB = runif(1, 0, 1), 
                                  phiC = runif(1, 0, 1), 
                                  psiA = rdirch(1, c(1,1,1)),
                                  psiB = rdirch(1, c(1,1,1)),
                                  psiC = rdirch(1, c(1,1,1)),
                                  pA = runif(1, 0, 1), 
                                  pB = runif(1, 0, 1), 
                                  pC = runif(1, 0, 1))}

Parameters to monitor.

parameters.to.save <- c("phiA", "phiB", "phiC", "psiA", "psiB", "psiC","pA", "pB", "pC")

MCMC settings.

n.iter <- 10000
n.burnin <- 5000
n.chains <- 2

Run nimble.

system.time(multisite.marginalized.out <- nimbleMCMC(code = multisite.marginalized, 
                                                     constants = my.constants,
                                                     data = my.data,              
                                                     inits = initial.values(),
                                                     monitors = parameters.to.save,
                                                     niter = n.iter,
                                                     nburnin = n.burnin, 
                                                     nchains = n.chains))

This run won’t work cause we need a rHMMweighted as well.

rHMMweighted <- nimbleFunction(
  run = function(n = integer(),    ## Observed capture (state) history
                 init = double(1),
                 probObs = double(2),
                 probTrans = double(2),
                 len = double(0, default = 0),
                 mult = double(0),
                 checkRowSums = double(0, default = 1)) {
    returnType(double(1))
    if (dim(probObs)[1] != dim(probTrans)[1]) stop("In rHMM: Number of cols in probObs must equal number of cols in probTrans.")
    if (dim(probTrans)[1] != dim(probTrans)[2]) stop("In rHMM: probTrans must be a square matrix.")
    if (abs(sum(init) - 1) > 1e-06) stop("In rHMM: Initial probabilities must sum to 1.")
    if (checkRowSums) {
      transCheckPasses <- TRUE
      for (i in 1:dim(probTrans)[1]) {
        thisCheckSum <- sum(probTrans[i,])
        if (abs(thisCheckSum - 1) > 1e-6) {
          ## Compilation doesn't support more than a simple string for stop()
          ## so we provide more detail using a print().
          print("In rHMM: Problem with sum(probTrans[i,]) with i = ", i, ". The sum should be 1 but is ", thisCheckSum)
          transCheckPasses <- FALSE
        }
      }
      obsCheckPasses <- TRUE
      for (i in 1:dim(probObs)[1]) {
        thisCheckSum <- sum(probObs[i,])
        if (abs(thisCheckSum - 1) > 1e-6) {
          print("In rHMM: Problem with sum(probObs[i,]) with i = ", i, ". The sum should be 1 but is ", thisCheckSum)
          obsCheckPasses <- FALSE
        }
      }
      if(!(transCheckPasses | obsCheckPasses))
        stop("In rHMM: probTrans and probObs were not specified correctly.  Probabilities in each row (second dimension) must sum to 1.")
      if(!transCheckPasses)
        stop("In rHMM: probTrans was not specified correctly.  Probabilities in each row (second dimension) must sum to 1.")
      if(!obsCheckPasses)
        stop("In rHMM: probObs was not specified correctly. Probabilities in each row must sum to 1.")
    }
    ans <- numeric(len)
    probInit <- init
    trueInit <- 0
    r <- runif(1, 0, 1)
    j <- 1
    while (r > sum(probInit[1:j])) j <- j + 1
    trueInit <- j
    trueState <- trueInit
    for (i in 1:len) {
      # Transition to a new true state
      r <- runif(1, 0, 1)
      j <- 1
      while (r > sum(probTrans[trueState, 1:j])) j <- j + 1
      trueState <- j
      # Detect based on the true state
      r <- runif(1, 0, 1)
      j <- 1
      while (r > sum(probObs[trueState, 1:j])) j <- j + 1
      ans[i] <- j
    }
    return(ans)
  })

Run nimble again.

system.time(multisite.marginalized.out <- nimbleMCMC(code = multisite.marginalized, 
                                                     constants = my.constants,
                                                     data = my.data,              
                                                     inits = initial.values(),
                                                     monitors = parameters.to.save,
                                                     niter = n.iter,
                                                     nburnin = n.burnin, 
                                                     nchains = n.chains))

Wow, this run took 1.5 minute, even faster than the marginalized version of the likelihood. The outputs are similar.

MCMCsummary(multisite.marginalized.out, round = 2)

        mean   sd 2.5%  50% 97.5% Rhat n.eff
pA      0.51 0.08 0.36 0.51  0.68 1.00   894
pB      0.45 0.05 0.36 0.45  0.55 1.02   844
pC      0.24 0.06 0.14 0.24  0.37 1.01   654
phiA    0.60 0.05 0.51 0.60  0.71 1.00  1086
phiB    0.70 0.04 0.63 0.70  0.77 1.01  1348
phiC    0.77 0.07 0.64 0.77  0.90 1.01   782
psiA[1] 0.74 0.06 0.62 0.75  0.84 1.01  1727
psiA[2] 0.24 0.05 0.14 0.23  0.35 1.01  1674
psiA[3] 0.02 0.02 0.00 0.02  0.08 1.00  2078
psiB[1] 0.07 0.02 0.04 0.07  0.12 1.00  1665
psiB[2] 0.84 0.04 0.75 0.84  0.90 1.01  1055
psiB[3] 0.09 0.03 0.04 0.09  0.17 1.01  1079
psiC[1] 0.02 0.01 0.00 0.02  0.06 1.00  2545
psiC[2] 0.21 0.05 0.12 0.21  0.33 1.01  1745
psiC[3] 0.77 0.05 0.65 0.77  0.86 1.01  1649

Analysis of the whole Canada geese dataset

So far we’ve worked with a subset of the original dataset. Fitting the same model to the whole data would be difficult, if not impossible. Let’s try it with the weighted likelihood. We first read in the data.

geese <- read_csv2("allgeese.csv", col_names = TRUE)

ℹ Using ',' as decimal and '.' as grouping mark. Use `read_delim()` for more control.

Warning: Duplicated column names deduplicated: '0' => '0_1' [2], '0' =>
'0_2' [3], '0' => '0_3' [4], '0' => '0_4' [5]


── Column specification ────────────────────────────────────────────────────────
cols(
  `0` = col_double(),
  `0_1` = col_double(),
  `0_2` = col_double(),
  `0_3` = col_double(),
  `0_4` = col_double(),
  `1` = col_double(),
  `158` = col_double()
)

geese <- as.matrix(geese)
y <- geese[,-7]
mult <- geese[,7]

In total, we have 2.1277^{4} banded geese, and only 622 unique capture histories.

Get the occasion of first capture for each individual

get.first <- function(x) min(which(x != 0))
first <- apply(y, 1, get.first)

Filter out individuals that are first captured at last occasion

mask <- which(first!=ncol(y))
y <- y[mask, ]

Recalculate occasion of first capture and sample size

first <- first[mask]
mult <- mult[mask]

Data in list.

my.data <- list(y = y + 1, 
                alpha = c(1, 1, 1))

Constant in a list.

my.constants <- list(first = first, 
                     K = ncol(y), 
                     N = nrow(y),
                     mult = mult)

Initial values.

initial.values <- function(){list(phiA = runif(1, 0, 1), 
                                  phiB = runif(1, 0, 1), 
                                  phiC = runif(1, 0, 1), 
                                  psiA = rdirch(1, c(1,1,1)),
                                  psiB = rdirch(1, c(1,1,1)),
                                  psiC = rdirch(1, c(1,1,1)),
                                  pA = runif(1, 0, 1), 
                                  pB = runif(1, 0, 1), 
                                  pC = runif(1, 0, 1))}

MCMC settings.

parameters.to.save <- c("phiA", "phiB", "phiC", "psiA", "psiB", "psiC","pA", "pB", "pC")
n.iter <- 10000
n.burnin <- 5000
n.chains <- 2

Run nimble.

system.time(multisite.marginalized.out <- nimbleMCMC(code = multisite.marginalized, 
                                                     constants = my.constants,
                                                     data = my.data,              
                                                     inits = initial.values(),
                                                     monitors = parameters.to.save,
                                                     niter = n.iter,
                                                     nburnin = n.burnin, 
                                                     nchains = n.chains))

It would take ages to run the same model on the > 20,000 individual encounter histories. Not even sure it can be run.

Let’s inspect the results.

MCMCsummary(multisite.marginalized.out, round = 2)

        mean   sd 2.5%  50% 97.5% Rhat n.eff
pA      0.47 0.01 0.45 0.47  0.50 1.01   766
pB      0.41 0.01 0.39 0.41  0.42 1.00   809
pC      0.34 0.01 0.31 0.34  0.37 1.00   800
phiA    0.65 0.01 0.64 0.65  0.67 1.00  1216
phiB    0.68 0.01 0.67 0.68  0.70 1.00   985
phiC    0.67 0.01 0.65 0.67  0.69 1.00  1041
psiA[1] 0.73 0.01 0.72 0.73  0.75 1.00  1516
psiA[2] 0.26 0.01 0.24 0.26  0.28 1.00  1540
psiA[3] 0.01 0.00 0.00 0.01  0.01 1.00  2150
psiB[1] 0.11 0.00 0.10 0.11  0.12 1.00  1240
psiB[2] 0.87 0.00 0.86 0.87  0.88 1.00  1261
psiB[3] 0.03 0.00 0.02 0.03  0.03 1.00  1793
psiC[1] 0.05 0.00 0.04 0.05  0.06 1.00  2188
psiC[2] 0.26 0.01 0.24 0.26  0.28 1.00  1446
psiC[3] 0.70 0.01 0.67 0.70  0.72 1.00  1471

Class 8 live demo: Skip your coffee break: Speed up MCMC convergence

The team

last updated: 2021-05-12

Introduction

Work with samplers: Use slice sampling

Work with samplers: Use block sampling

Marginalization with nimbleEcology

Marginalization and weighted likelihood with nimbleEcology

Analysis of the whole Canada geese dataset