What you see is not what you get: Hidden Markov models and capture-recapture data
Question Answer(s)
Does Nimble not like zeros? Is that why you use 2 instead of 0? Not in general. It's no problem to code a Cormack-Jolly-Seber model using 0 and 1, for example. It all depends on how you define the likelihood.
NA Ah wait... here we are assuming the animal is for sure dead; it is not detection/non-detection...
NA Worth noting here also that when we work with categorical distributions - which we typically do with multistate mark-recapture models - the 0 is not allowed. That is because the latent state also functions as a dynamic index in these cases, and an index of 0 does not work.
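For illustration, a minimal sketch of why a 0 would break (state coding assumed from the lecture: 1 = alive, 2 = dead): the previous latent state picks a row of the transition matrix, and row 0 does not exist.
```r
## inside the loop over individuals i
for (t in (first[i] + 1):T) {
  # z[i, t - 1] indexes a row of gamma, so it must be 1 or 2 - never 0
  z[i, t] ~ dcat(gamma[z[i, t - 1], 1:2])
}
```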
Is the dead state here from dead recovery data or an unobservable state? no dead recovery data here
NA dead correspond to the ‘real’ state, that is not observed - corresponding observation would be ‘not detected’ all the time
NA Roger, gotcha! Thanks!
NA just a brief follow-up: so you should not use "not detected" data here? It makes sense that you need to know whether your individuals are dead or alive for real in a given year to be able to talk about survival, right?
how can we include in these approaches the experimental design, for example plots and blocks? we will cover this in next sessions :smiley: just running through the basic principle of the model here
NA great! thanks
NA briefly: covariates can be included in a ‘regression-like’ fashion on phi or p parameters
NA I was thinking in terms of linear mixed effects models and hence including the experimental design as random effects
NA yes it is possible :smiley:
NA ok. i will wait to learn that. thanks
NA we’ll see how to include random effect and covariates in session 5
Can we write the likelihood vectorized instead of a loop as Chloe suggested? see this other question here
NA Can you say more what you mean? You can do vectorized algebra in model code. You can't use a vectorized distribution in model code. To do that, you would have to write the version of the distribution you want as a nimbleFunction ("Class 8") which can simply use a vectorized call to the distribution internally. It is not hard but is a little higher on the learning curve.
NA Thanks Perry. I thought looping was generally replaced for vectorized operations in NIMBLE.
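For reference, a hedged sketch of what Perry describes: a user-defined vectorized Bernoulli distribution written as a nimbleFunction (the name dBernVec is made up for illustration). Recent nimble versions register such distributions automatically when they appear in model code; older versions need registerDistributions().
```r
library(nimble)

dBernVec <- nimbleFunction(
  run = function(x = double(1), prob = double(1),
                 log = integer(0, default = 0)) {
    returnType(double(0))
    # one vectorized call to the distribution instead of a loop over occasions
    logProb <- sum(dbinom(x, size = 1, prob = prob, log = TRUE))
    if (log) return(logProb)
    return(exp(logProb))
  }
)

## In model code, a single vectorized declaration then replaces a loop:
## y[i, 1:T] ~ dBernVec(prob = p[1:T])
```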
Does nimble allow for block definitions? For instance, if you had a more complex model with time, could you use gamma[1:2, 1, 1:5] <- 0, or do you have to go index by index? Hi Ana. This works, at least over 1 dimension, i.e. gamma[1, 1, 1:5] <- 0. Will get back to you about multiple dimensions :slightly_smiling_face:
NA nimble is strict about scalar vs. vector vs. matrix, so I think you would need to create the matrix of 0's rather than assigning a scalar to a matrix, so it is not always exactly as you would do in R.
NA Actually, it does work with a matrix. In general, points of confusion sometimes arise when assigning between scalar, vector, or matrix, but in this case it works.
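A toy sketch of the above (not a full model), just to see both block assignments accepted when building the model:
```r
library(nimble)

code <- nimbleCode({
  p ~ dunif(0, 1)
  gamma[1, 1, 1:5] <- 0      # scalar into a vector slice
  gamma[1:2, 2, 1:5] <- 0    # scalar into a 2 x 5 block, per Perry's reply above
  for (t in 1:5) gamma[2, 1, t] <- p
})
m <- nimbleModel(code)
m$gamma
```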
Perhaps I missed it (in which case sorry!), but must you give an initial parameter function or value(s)? Perhaps you can leave it blank for automagically generated inits? Haha "automagically generated inits". I love that :male_mage: In any case: it's true that the MCMC engine will "pick" initial values if you don't supply any. BUT: it may not pick good ones, and then you'll end up with initialisation problems. It's always better to provide initial values.
NA Don't trust that kind of magic :wink:
NA This is a rather technical reply, but here goes: The default MCMC sampling methods used by nimble have a somewhat harder time getting out of bad initial values than those used by JAGS, for example, so people experience higher need to choose decent initial values. If you have no idea, you can do a preliminary run. It is also possible to configure the MCMC sampling methods, which will be covered in "Class 8".
NA Thanks. I will keep that in mind. I'm in the habit of using default JAGS/Stan inits, i.e., not specifying them and allowing defaults.
NA Makes sense. The default inits will be drawn from the prior, and those can be far off from the posterior region of interest. That can also impact the various kinds of self-tuning that each system does, because they may start to tune to regions that aren't like the region of interest.
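To make this concrete, a hedged sketch of passing explicit inits rather than relying on defaults (object names follow the live demo's CJS example - phi, p, and a zinits matrix for the latent states):
```r
initial.values <- function() list(phi = runif(1, 0, 1),
                                  p = runif(1, 0, 1),
                                  z = zinits)

mcmc.phip <- nimbleMCMC(code = hmm.phip,
                        constants = my.constants,
                        data = my.data,
                        inits = initial.values,  # explicit, not drawn from priors
                        niter = 5000, nburnin = 1000, nchains = 2)
```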
What is the hidden part in this particular model? If an individual is not detected, you actually don't know which state it is in (alive and unseen, or dead)
NA I assume you mean not detected until the end of the study? As in: if it is not observed at, say, time 3 but I detect it at time 4, then I also know it must have survived, so I know its state?
NA Thanks, but then the assumption of a dead individual remaining dead does not hold. An undetected individual could be re-detected in subsequent sampling occasions... Or am I missing something?
NA yes, but if it's never detected again you won't know... now imagine there are more states than just alive and dead (for example, in site 1, site 2, etc.); you might not know the particular state at the time you did not detect the individual
NA yes and no. At the moment, we are not yet considering "intrapolation" information on known states. Think if there were multiple alive states: you would not know the state between two observations.
NA OK that makes sense!
NA we set this in the transition matrix: once you're dead (row 2), the probability of transitioning to state alive is set to 0
NA it's important to keep the "state" process (alive vs. dead) and the "observation" process separate. A non-detected animal may be dead. A dead animal will never be detected (in this example)
NA Ok, but in this particular model I do not see where this observation process is ...
NA more details in next session :smiley:
NA Awesome! Looking forward to it.
NA Ok, so the model so far was a Markov model and not a hidden Markov model, my bad
NA you're right, here he was assuming perfect detectability
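For what it's worth, a hedged sketch of the two matrices the full HMM will use in the next session, keeping the state and observation processes separate (state 1 = alive, 2 = dead; observation 1 = non-detected, 2 = detected; coding assumed from the lecture):
```r
gamma[1, 1] <- phi       # alive -> alive
gamma[1, 2] <- 1 - phi   # alive -> dead
gamma[2, 1] <- 0         # dead -> alive: impossible
gamma[2, 2] <- 1         # dead stays dead
omega[1, 1] <- 1 - p     # alive, non-detected
omega[1, 2] <- p         # alive, detected
omega[2, 1] <- 1         # dead: never detected
omega[2, 2] <- 0
```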
If I understood correctly, here we assume we know whether or not individuals are dead. How do you account for the individuals who may not truly be dead, but are merely absent in a particular sampling occasion? (I realise this scenario may come later in the workshop) more about this in the next session =)
NA but I guess here already you are not assuming you know an individual is either alive or dead if you have not seen it; it might just be alive and undetected, and the likelihood accounts for that
NA OK thanks Sarah, I suspected this may be the case. Sorry for the impatience. :slightly_smiling_face:
NA Ignore my comment :slightly_smiling_face: you are right, in this case we assume perfect detectability - duh
What was the code using the command MCMCtrace in the live demo? I wasn't able to type it down. Example: MCMCtrace(object = mcmc.phip, ISB = FALSE, exact = TRUE, params = c("phi"), pdf = FALSE, priors = priors)
NA Thanks!
NA you can download all the scripts from the website
NA click on 'live demo'
For models with partially observable states, the initialization step can be really painful with JAGS. Is nimble more friendly? :no_mouth: Hi Manu :wave: oh yes... very painful!! Not sure about this, Chloe and Perry should know
NA No and yes. No, because if you work with latent state models, even with nimble, you have to provide possible initial values for latent states for nimble just like for JAGS. Yes, because you can actually forget about latent states altogether and work with marginalized likelihoods instead. More on that in lecture 8 :wink:
NA What exactly do we mean by "partially observable"? Does that mean there are periods when some individuals could not be observed, i.e., missing data in the detection matrix?
NA The main thing I see from user questions is that you must provide inits that are valid in the model. E.g. if the initial states involve an impossible (zero probability) transition or observation, then the initial log probability sum in the model will be -Inf, and the sampling struggles to get started and can really fail.
NA What I think makes things easier in nimble is that the model is an object in R and you can access and modify its variables and you can calculate each part of the model as you choose, so you can track down where you have a problem. This will be briefly touched upon in "Class 8".
NA Yes, it's the initial latent state that usually causes headaches with JAGS, so nimble seems to provide means of making our lives easier, GREAT (Class 8 :sunglasses:)!
NA Yes, I meant data where information on the true state is not directly available, like in any CMR or occupancy model ("partially observable states"). I think that state uncertainty (another issue) will be covered tomorrow. I don't know if this specific source of uncertainty raises problems in the initialization step.
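A hedged sketch of the debugging workflow Perry describes above (object names assumed from the live demo):
```r
model <- nimbleModel(code = hmm.phip,
                     constants = my.constants,
                     data = my.data,
                     inits = initial.values())
model$calculate()   # -Inf means the inits imply an impossible transition/observation
model$logProb_z     # per-node log-probabilities: locate the offending entries
```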
I got a bit confused on slide 140/402 of the pdf with all slides, in the example that was just being discussed - 2 occasions, the animal is detected, then missed: that should be either (1-p)\phi OR 1-\phi, not simply (1-p)\phi? Did I miss something?
NA Yes, I realized while talking that something was wrong. I'll fix it tonight, or tomorrow.
Important addition:
You can still use capture histories with 0 and 1 for basic Cormack-Jolly-Seber and Brownie models, as long as the likelihood is formulated using a Bernoulli likelihood (instead of the categorical we are demonstrating here).
I'll make sure to mention it again in the next session
How do you implement the zinits when you have recoveries? That's the tricky bit: you have to make sure that the inits are compatible with the model, like in JAGS. So if dead-recovery is, for example, coded 2, you cannot have a '1' for alive after this for the same individual... quite painful!
NA If you have recoveries in addition to observations, you need an additional state "newly dead" in your model. Your z initis then have to be "newly dead" at the time of recovery, and "dead" afterwards. More tomorrow during multistate models.
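For the simple alive/dead case, a hedged sketch of building zinits (coding assumed from the lecture: 1 = alive, 2 = dead; first gives each individual's first capture occasion):
```r
zinits <- matrix(1, nrow = N, ncol = T)   # start everyone as "alive"
for (i in 1:N) {
  if (first[i] > 1) zinits[i, 1:(first[i] - 1)] <- NA  # before first capture
}
## with dead recoveries you would additionally set "newly dead" at the
## recovery occasion and "dead" afterwards, as described above
```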
I wonder how the likelihood formula works, in the slides we have:
```z[i, j] ~ dmultinom(gamma(...))```
But I am under the impression that the dmultinom distribution expects several (2 in this case) values, while we only pass one.
Gamma is a matrix; the row is selected by z, and then the two probabilities (columns) are passed to dmulti
NA Ok, but say I want to know the probability of a certain observed value from a multinomial distribution in R: `dmultinom(1, 1, c(0.2, 0.8))` This returns an error because the function seems to be expecting more than one observed value. It is just that I am not very familiar with this distribution; with a normal one it is easy to get at the same thing: `dnorm(0, 0, 1)`
NA Lionel, try passing a vector of zeros with a single 1 for the realization you are interested in.
NA dmultinom(c(1,0), 1, c(0.2, 0.8))
NA Works, but in the Nimble code this distribution apparently works with just one value: `z[i, j] ~ dmultinom(...)` Which is a bit confusing to me...
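To reconcile the two notations: R's dmultinom wants the full indicator vector, whereas nimble's categorical distribution dcat (which is what the model code actually uses, see the next question) is the same one-draw distribution written on the category label itself, so a single value works in model code.
```r
# R: the one-draw multinomial needs an indicator vector
dmultinom(c(0, 1), size = 1, prob = c(0.2, 0.8))   # P(category 2) = 0.8

## nimble model code: dcat works directly on the category label (1 or 2)
## z[i, j] ~ dcat(gamma[z[i, j - 1], 1:2])
```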
Does delta serve a role at this point (i.e. contributing two values to the likelihood) or is it just a placeholder for upcoming models that allow transitions? In other words, could the dcat(delta) just be replaced with a <- 1 for this model? delta is actually a vector (1, 0) over the possible states, here 1 for state alive and 0 for state dead, but imagine we could have more states (see tomorrow's session)
NA but it just serves to condition upon first capture, because some individuals enter the data set at later capture occasions
NA Thanks. I assumed it was kind of a placeholder at this point until we had more states, because in the survival model we have to condition upon first capture. In this specific case, would it change the estimation if `z[i, first[i]] ~ dcat(delta[1:2])` was replaced with `z[i, first[i]] <- 1`?
NA Placeholder yes. Useful for more complex models to come.
What if phi and p differ between steps? How can we manage that? all about this in next session =)
For individuals that weren't born until later time points, would you still code zinits as alive, even if you know they were not? Typically, we condition on first capture, meaning individuals only enter "the data" when they are captured and marked for the first time.
NA Ah okay first capture rather than first time point/survey. Thanks :slightly_smiling_face:
NA you can code NA before first capture
NA In the next lecture, you will see that we analyse capture histories from each individual's "first" encounter occasion, not from the very first sampling occasion :wink:
Thanks for the presentation Sarah. Can the same methods be used in Resource Selection Functions using Generalised Linear Mixed Models? Your examples are mostly based on binary data (0, 1)... I think this question should go in channel questions-5-survival-estimation :slightly_smiling_face:
The HMM Likelihood is the sum (over all states) of the products of Omegas and the products of gammas (slide 136 of pdf). In the Nimble code (slide 146), we specify the z~ and y~ but there is no need to specify that the likelihood is the sum of the products. I think I missed why that is the case. Could you please explain why? I am interested to know how this magic happens, and if this applies only in this specific case, or it is a general feature. Thanks! (and sorry for the delayed Q - I am watching the recorded session from Canada's west coast) In the Nimble code, we use the full likelihood parametrized with parameters (phi, p) and latent states (z), because we can estimate both these quantities with MCMC. Often, we don't need to estimate the latent states, we're just after the parameters. In such a case, we may get rid of the states by summing over them - this is called marginalization. Marginalization often helps to reach convergence faster (see Yackulic paper in further reading). We'll see in class 8 that Nimble provides the marginalized likelihood through nimbleEcology. Which means that in the Nimble framework we can use both the full likelihood and the marginalized likelihood, according to our needs. Isn't that awesome! Note that the marginalized likelihood is what we use to get max lik estimates, although it's possible to get the latent states after the estimation step using the HMM toolbox (and the Viterbi algorithm). Hope it helps.
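A hedged sketch of the marginalized route via nimbleEcology's dCJS_ss (scalar survival, scalar detection): the latent z disappears from the model because the states are summed over inside the distribution. Note that dCJS expects 0/1 capture histories.
```r
library(nimbleEcology)

hmm.marginal <- nimbleCode({
  phi ~ dunif(0, 1)
  p ~ dunif(0, 1)
  for (i in 1:N) {
    # marginalized likelihood: no z nodes, states summed out internally
    y[i, first[i]:T] ~ dCJS_ss(probSurvive = phi, probCapture = p,
                               len = T - first[i] + 1)
  }
})
```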
Beginner question in CMR: The introduced models always assume a 1-year lag between two successive observations. What is the impact of this assumption on the parameter estimates if observations are actually spread over several months among years? Consequently, is there any way to implement CMR models based on continuous time? That's something we didn't cover, thank you for asking! You need to have some time between the sampling occasions, and the sampling occasions should be kind of instantaneous to avoid having survival or dispersal occurring within them. This paper offers a nice discussion of CMR assumptions. If you need to go for continuous-time models, check out this nice paper and references therein.
NA Thank you :slightly_smiling_face:
NA Hi! I'll just add to what Olivier said that instead of a continuous model, you may also want to look into: 1. Different lengths of time-steps, i.e. the interval could be 1 month, or 2 weeks, or whatever, instead of 1 year. Similarly, it can also be 2 or more years. This is up to you; just keep in mind that survival probabilities are specific to your chosen time interval :wink: 2. You can "aggregate" / discretize capture occasions. For example, if you collect data every year from June-August, you can just treat the whole June-August block as the annual sampling occasion (and lump all observations together). This is often a pragmatic solution. The assumption that comes with it is that within the sampling occasion, the population is closed (no immigration/emigration) and there is no mortality.
NA You may also want to check out "Pollock's robust design" for a combination of 1. and 2. above :wink:
Hi hi, I don't know if there is still someone there - in case... I am trying to replicate sections 4 and 5 with my own data, and I keep getting the following error:
```
defining model...
building model...
setting data and initial values...
running calculate on model (any error reports that follow may simply reflect missing values in model variables) ...
checking model sizes and dimensions... Error in is.nan(value) : default method not implemented for type 'list'
```
I am like 99.9999% sure it's a dumb issue coming from something very dumb I am doing (or not doing). But in case that's a classic beginner issue and someone can hint at things I should check, I would be most grateful. Thanks.
Happy to help, can you send me the data and code to reproduce the problem? We'll report back here.
NA Thanks Olivier. Doing this asap.
NA Will we report even if I am veryyyy dumb? :sweat_smile:
NA Pick your character, I'll be the other one :slightly_smiling_face:
NA ok... so: my capture-recapture matrix (named y in the code provided during the workshop) was actually of class data.frame, and this was the problem. Thanks to Olivier for checking my code and pointing this out. Expertise lies in the details, right? So a simple y <- as.matrix(y) does the job. Thanks Olivier.
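For anyone hitting the same error, the two-line check and fix:
```r
class(y)            # "data.frame" is the culprit here
y <- as.matrix(y)   # nimble expects numeric matrices/arrays for data
```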