7 Lack of fit
WORK IN PROGRESS
7.1 Individual heterogeneity
On wolf, see Cubaynes et al. (2010), Gimenez and Choquet (2010), or go full non-parametric w/ Turek, Wehrhahn, and Gimenez (2021). See Pradel (2009) for black-headed gull example.
Our example is about individual heterogeneity and how to account for it with HMMs. Gray wolf is a social species with hierarchy in packs which may reflect in species demography. As an example, we’ll work with gray wolves.

Figure 7.1: Dominance in wolves.
Gray wolf is a social species with hierarchy in packs which may reflect in demography. Shirley Pledger in a series of papers developed heterogeneity models in which individuals are assigned in two or more classes with class-specific survival/detection probabilities. Cubaynes et al. (2010) used HMMs to account for heterogeneity in the detection process due to social status, see also Pradel (2009). Dominant individuals tend to use path more often than others, and these paths are where we look for scats.
Individual heterogeneity
3 states
alive in class 1 (A1)
alive in class 2 (A2)
dead (D)
2 observations
not captured (1)
captured (2)
Vector of initial state probabilities
\[\begin{matrix} & \\ \mathbf{\delta} = \left ( \vphantom{ \begin{matrix} 12 \end{matrix} } \right . \end{matrix} \hspace{-1.2em} \begin{matrix} z_t=A1 & z_t=A2 & z_t=D \\ \hdashline \pi & 1 - \pi & 0\\ \end{matrix} \hspace{-0.2em} \begin{matrix} & \\ \left . \vphantom{ \begin{matrix} 12 \end{matrix} } \right ) \begin{matrix} \end{matrix} \end{matrix}\]
\(\pi\) is the probability of being alive in class 1. \(1 - \pi\) is the probability of being in class 2.
Transition matrix
\[\begin{matrix} & \\ \mathbf{\Gamma} = \left ( \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right . \end{matrix} \hspace{-1.2em} \begin{matrix} z_t=A1 & z_t=A2 & z_t=D \\ \hdashline \phi & 0 & 1 - \phi\\ 0 & \phi & 1 - \phi\\ 0 & 0 & 1 \end{matrix} \hspace{-0.2em} \begin{matrix} & \\ \left . \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right ) \begin{matrix} z_{t-1}=A1 \\ z_{t-1}=A2 \\ z_{t-1}=D \end{matrix} \end{matrix}\]
\(\phi\) is the survival probability, which could be made heterogeneous.
Transition matrix, with change in heterogeneity class
\[\begin{matrix} & \\ \mathbf{\Gamma} = \left ( \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right . \end{matrix} \hspace{-1.2em} \begin{matrix} z_t=A1 & z_t=A2 & z_t=D \\ \hdashline \phi (1-\psi^{12}) & \phi \psi^{12} & 1 - \phi\\ \phi \psi^{21} & \phi (1-\psi^{21}) & 1 - \phi\\ 0 & 0 & 1 \end{matrix} \hspace{-0.2em} \begin{matrix} & \\ \left . \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right ) \begin{matrix} z_{t-1}=A1 \\ z_{t-1}=A2 \\ z_{t-1}=D \end{matrix} \end{matrix}\]
\(\psi^{12}\) is the probability for an individual to change class of heterogeneity, from 1 to 2. \(\psi^{21}\) is the probability for an individual to change class of heterogeneity, from 2 to 1.
Observation matrix
\[\begin{matrix} & \\ \mathbf{\Omega} = \left ( \vphantom{ \begin{matrix} 12 \\ 12 \\ 12\end{matrix} } \right . \end{matrix} \hspace{-1.2em} \begin{matrix} y_t=1 & y_t=2\\ \hdashline 1 - p^1 & p^1\\ 1 - p^2 & p^2\\ 1 & 0 \end{matrix} \hspace{-0.2em} \begin{matrix} & \\ \left . \vphantom{ \begin{matrix} 12 \\ 12 \\ 12\end{matrix} } \right ) \begin{matrix} z_{t}=A1 \\ z_{t}=A2 \\ z_{t}=D \end{matrix} \end{matrix}\]
\(p^1\) is detection for individuals in class 1, and \(p^2\) that of individuals in class 2.
Results
## mean sd 2.5% 50% 97.5% Rhat n.eff
## p1 0.38 0.09 0.23 0.38 0.56 1.04 210
## p2 0.50 0.12 0.25 0.50 0.73 1.01 229
## phi 0.81 0.05 0.71 0.81 0.91 1.04 317
## pi 0.62 0.12 0.36 0.63 0.83 1.02 164
We have lowly detectable individuals (class A1 with \(p^1\)) in proportion 62%. And highly (or so) detectable individuals (class A2 with \(p^2\)) in proportion 38%. Note that interpretation of classes is made a posteriori. Survival is 81%.

From the simulations I run, seems like the categorical sampler on latent states gets stuck in places that depend on initial values. Changing for the slice sampler improves thing a bit, but not that much. Only option is to get rid of the states and use the marginalized likelihood. Nice illustration of the use of simulations (to check model is doing ok, estimated are valid, etc.), changing samplers, nimbleEcology, NIMBLE functions, etc.
You may consider more classes, and select among models, see Cubaynes et al. (2012). You may also go for a non-parametric approach and let the data tell you how many classes you need. This is relatively easy to do in NIMBLE, see Turek, Wehrhahn, and Gimenez (2021). More about individual heterogeneity in Gimenez, Cam, and Gaillard (2018).
Bottom line, the way this model is written is completely destined for failure. The state-space formulation that’s written, with the two different groups for individuals (high detection and low detection) sets the model up for failure, and it will never work correctly when written this way. Briefly:
For any individual that is seen at any time t > 1 (that is, individuals seen again after the first sighting), the detection history looks something like: 1 0 0 1 ….. (1 represents detection). The initial values for the latent z state are either : 1 1 1 1 1….. or 2 2 2 2 2 putting that individual into one of the two groups (1 or 2 always). when sampling, the latent z can never transition to the other group, from group 2 to group 1, or from group 1 to group 2. It will be stuck where ever it started. If the categorical sampler tries to change the final state from the initial value of (say) 2 to 1, then this transition is deemed to be impossible by the prior (defined by the gamma state transition matrix), since the state at the previous time was 2, and 2 in one period (the previous period) does not permit a state of 1 in the next time period. SImilarly, if the first (or any intermediate) value of z attempts to transition from (say) 2 to 1, then the following state is still 1, and that dependency does not allow the state in question to change to 1, because a state of 1 cannot have the next state be 2. Even if some “dead” states (3’s) are added to the end of the z vector over the course of MCMC sampling, and say z becomes: 1 1 1 1 3 3 The 3’s can never propagate “earlier” than shown here (since there are detections at t=1 and t=4, so the individual cannot be in state 3 at time t=4), so the problem described above will always be the case, and this individual will always remain in group 1, no matter how long you run the MCMC.
The only time an individual (with the model written as such) could change between groups (1 -> 2, or 2 -> 1) from their initial group assignment of zinit, would be if the individual is only observed on the first time period, the detection history is: 1 0 0 0 0 0, Then say the initial value for z is: 1 1 1 1 1 1 (always in group 1), then the sampler at some point could begin transitioning the final 1 into state 3 (dead), so after an MCMC iteration, the z for this individual could be: 1 1 1 1 1 3, then if we’re lucky, the sampler on the final 1 would some time change it to a 3: 1 1 1 1 3 3 then this could happen again later: 1 1 1 3 3 3 and again: 1 1 3 3 3 3 and once again: 1 3 3 3 3 3 and only now, finally, if we’re lucky, the sampler operating on the first value of the z vector could change this group assignment from 1 to 2: 2 3 3 3 3 3 And that’s the only situation when individuals can possibly change groups in this model.
The problem again, is that any individual seen > 1 time (it’s resighted after the first observation occasion) can never change group assignments away from their initial value group assignment. So, this model is destined for failure.
There are many ways one could fix this: - Marginalize over the z’s as you did (perhaps using nimbleEcology) - Write the model differently, using a binary latent indicator variable to represent the group assignment of each individual. This could also work for > 2 groups, where the group indicator variable follows a categorical prior - Use a latent state formulation as you have, but write a custom MCMC sampler to update entire rows of the z matrix (the latent state variables for one individual for all time periods) simultaneously, thus enabling transitions between groups - Probably other ways, also.
7.2 Trap dep
Multievent formulation à la Pradel and Sanz-Aguilar (2012). Also add example w/ individual time-varying covariate.
7.3 Transience
Multievent treatment à la Genovart and Pradel (2019). Remind of the two age-classes on survival technique.
7.4 Temporary emigration
Multistate treatment as in Schaub et al. (2004). See example in Bǎncilǎ et al. (2018).
Transition matrix:
\[\begin{matrix} & \\ \mathbf{\Gamma} = \left ( \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right . \end{matrix} \hspace{-1.2em} \begin{matrix} z_t=\text{in} & z_t=\text{out} & z_t=\text{D} \\ \hdashline \phi (1-\psi^{\text{in} \rightarrow \text{out}}) & \phi \psi^{\text{in} \rightarrow \text{out}} & 1 - \phi\\ \phi \psi^{\text{out} \rightarrow \text{in}} & \phi (1-\psi^{\text{out} \rightarrow \text{in}}) & 1 - \phi\\ 0 & 0 & 1 \end{matrix} \hspace{-0.2em} \begin{matrix} & \\ \left . \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right ) \begin{matrix} z_{t-1}=\text{in} \\ z_{t-1}=\text{out} \\ z_{t-1}=\text{D} \end{matrix} \end{matrix}\]
Observation matrix:
\[\begin{matrix} & \\ \mathbf{\Omega} = \left ( \vphantom{ \begin{matrix} 12 \\ 12 \end{matrix} } \right . \end{matrix} \hspace{-1.2em} \begin{matrix} y_t=1 & y_t=2 \\ \hdashline 1 - p & p\\ 1 & 0\\ 1 & 0 \end{matrix} \hspace{-0.2em} \begin{matrix} & \\ \left . \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right ) \begin{matrix} z_{t}=\text{in} \\ z_{t}=\text{out} \\ z_{t}=\text{D} \end{matrix} \end{matrix}\]
7.5 Memory model
How to make your models remember?
So far, the dynamics of the states are first-order Makovian. The site where you will be depends only on the site where you are, and not on the sites you were previously. How to relax this assumption, and go second-order Markovian?
Memory models were initially proposed by Hestbeck, Nichols, and Malecki (1991) and Brownie et al. (1993), then formulated as HMMs in Rouan, Choquet, and Pradel (2009). See also D. J. Cole et al. (2014).
Remember HMM model for dispersal between 2 sites
Transition matrix
\[\begin{matrix} & \\ \mathbf{\Gamma} = \left ( \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right . \end{matrix} \hspace{-1.2em} \begin{matrix} z_t=A & z_t=B & z_t=D \\ \hdashline \phi^A (1-\psi^{AB}) & \phi^A \psi^{AB} & 1 - \phi^A\\ \phi^B \psi^{BA} & \phi^B (1-\psi^{BA}) & 1 - \phi^B\\ 0 & 0 & 1 \end{matrix} \hspace{-0.2em} \begin{matrix} & \\ \left . \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right ) \begin{matrix} z_{t-1}=A \\ z_{t-1}=B \\ z_{t-1}=D \end{matrix} \end{matrix}\]
Observation matrix
\[\begin{matrix} & \\ \mathbf{\Omega} = \left ( \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right . \end{matrix} \hspace{-1.2em} \begin{matrix} y_t=1 & y_t=2 & y_t=3 \\ \hdashline 1 - p^A & p^A & 0\\ 1 - p^B & 0 & p^B\\ 1 & 0 & 0 \end{matrix} \hspace{-0.2em} \begin{matrix} & \\ \left . \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \end{matrix} } \right ) \begin{matrix} z_{t}=A \\ z_{t}=B \\ z_{t}=D \end{matrix} \end{matrix}\]
HMM formulation of the memory model
To keep track of the sites previously visited, the trick is to consider states as being pairs of sites occupied
States
AA is for alive in site A at \(t\) and alive in site A at \(t-1\)
AB is for alive in site A at \(t\) and alive in site B at \(t-1\)
BA is for alive in site B at \(t\) and alive in site A at \(t-1\)
BB is for alive in site B at \(t\) and alive in site B at \(t-1\)
D is for dead
Observations
1 not captured
2 captured at site A
3 captured at site B
Vector of initial state probabilities
\[\begin{matrix} & \\ \mathbf{\delta} = \left ( \vphantom{ \begin{matrix} 12 \end{matrix} } \right . \end{matrix} \hspace{-1.2em} \begin{matrix} z_t=AA & z_t=AB & z_t=BA & z_t=BB &z_t=D \\ \hdashline \pi^{AA} & \pi^{AB} & \pi^{BA} & \pi^{BB} & 0\\ \end{matrix} \hspace{-0.2em} \begin{matrix} & \\ \left . \vphantom{ \begin{matrix} 12 \end{matrix} } \right ) \begin{matrix} \end{matrix} \end{matrix}\]
where \(\pi^{BB} = 1 - (\pi^{AA} + \pi^{AB} + \pi^{BA})\), and \(\pi^{ij}\) at site \(j\) when first captured at \(t\) and site \(i\) at \(t - 1\).
Transition matrix
\[\begin{matrix} & \\ \mathbf{\Gamma} = \left ( \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \\ 12 \\ 12\end{matrix} } \right . \end{matrix} \hspace{-1.2em} \begin{matrix} z_t=AA & z_t=AB & z_t=BA & z_t=BB & z_t=D \\ \hdashline \phi^{AAA} & \phi^{AAB} & 0 & 0 & 1 - \phi^{AAA} - \phi^{AAB}\\ 0 & 0 & \phi^{ABA} & \phi^{ABB} & 1 - \phi^{ABA} - \phi^{ABB}\\ \phi^{BAA} & \phi^{BAB} & 0 & 0 & 1 - \phi^{BAA} - \phi^{BAB}\\ 0 & 0 & \phi^{BBA} & \phi^{BBB} & 1 - \phi^{BBA} - \phi^{BBB}\\ 0 & 0 & 0 & 0 & 1 \end{matrix} \hspace{-0.2em} \begin{matrix} & \\ \left . \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \\ 12 \\ 12\end{matrix} } \right ) \begin{matrix} z_{t-1}=AA \\ z_{t-1}=AB \\ z_{t-1}=BA \\ z_{t-1}=BB \\ z_{t-1}=D \end{matrix} \end{matrix}\]
\(\phi^{ijk}\) is probability to be in site \(k\) at time \(t + 1\) for an individual present in site \(j\) at \(t\) and in site \(i\) at \(t - 1\)
Transition matrix, alternate parameterization
\[\begin{matrix} & \\ \mathbf{\Gamma} = \left ( \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \\ 12 \\ 12\end{matrix} } \right . \end{matrix} \hspace{-1.2em} \begin{matrix} z_t=AA & z_t=AB & z_t=BA & z_t=BB & z_t=D \\ \hdashline \phi \psi^{AAA} & \phi (1 - \psi^{AAA}) & 0 & 0 & 1 - \phi\\ 0 & 0 & \phi (1 - \psi^{ABB}) & \phi \psi^{ABB} & 1 - \phi\\ \phi \psi^{BAA} & \phi (1 - \psi^{BAA}) & 0 & 0 & 1 - \phi\\ 0 & 0 & \phi (1-\psi^{BBB}) & \phi \psi^{BBB} & 1 - \phi\\ 0 & 0 & 0 & 0 & 1 \end{matrix} \hspace{-0.2em} \begin{matrix} & \\ \left . \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \\ 12 \\ 12\end{matrix} } \right ) \begin{matrix} z_{t-1}=AA \\ z_{t-1}=AB \\ z_{t-1}=BA \\ z_{t-1}=BB \\ z_{t-1}=D \end{matrix} \end{matrix}\]
\(\phi\) is the probability of surviving from one occasion to the next. \(\psi_{ijj}\) is the probability an animal stays at the same site \(j\) given that it was at site \(i\) on the previous occasion.
Observation matrix
\[\begin{matrix} & \\ \mathbf{\Omega} = \left ( \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \\ 12 \\ 12\end{matrix} } \right . \end{matrix} \hspace{-1.2em} \begin{matrix} y_t=1 & y_t=2 & y_t=3 \\ \hdashline 1 - p^A & p^A & 0\\ 1 - p^B & 0 & p^B\\ 1 - p^A & p^A & 0\\ 1 - p^B & 0 & p^B\\ 1 & 0 & 0 \end{matrix} \hspace{-0.2em} \begin{matrix} & \\ \left . \vphantom{ \begin{matrix} 12 \\ 12 \\ 12 \\ 12 \\ 12\end{matrix} } \right ) \begin{matrix} z_t=AA \\ z_t=AB \\ z_t=BA \\ z_t=BB \\ z_t=D \end{matrix} \end{matrix}\]
7.6 Posterior predictive check
Classical m-array (minimal sufficient statistics for CJS model) as in Paganin and de Valpine (2023). Individual performance in Chambert, Rotella, and Higgs (2014) and Nater et al. (2020). Sojourn time is geometric assumption in Conn et al. (2018).
For the CJS model, we would use the so-called m-array which gathers the elements \(m_{ij}\) for the number of marked individuals initially released at time \(i\) that were first detected again at time \(j\).
Refer to a case study. With m-array and Nimble functions. Refer to paper by Paganin & de Valpine and use code in https://github.com/salleuska/fastCPPP. Also papers by Chambert et al. (individual performance) and Conn et al. (geometric time, and hidden semi-Markov models).
Check out https://r-nimble.org/nimbleExamples/posterior_predictive.html.