Studying population dynamics is essential for understanding the complex interactions between species and their environment. By modeling population dynamics, researchers can gain an understanding of how animals respond to environmental changes and activities. This understanding is vital for conservation efforts as it aids in identifying endangered species and devising strategies to safeguard biodiversity. Furthermore, animal populations can act as reservoirs for infectious diseases, impacting both human and animal health. By comprehensively understanding population movements, scientists can predict outbreaks, discern transmission patterns, and enhance disease control and prevention measures.
This report extends research by Hjeljord and Loe, who explain the dwindling numbers of willow ptarmigan Lapagos lapagos in Northeastern Scandinavia [1]. Although not in immediate danger, the willow ptarmigan faces threats from habitat loss, climate change, and hunting and trapping. Hjeljord and Loe postulate that, along with climate, a long-term dampening of the amplitude in small rodents and an increase in red fox numbers, have prevented willow ptarmigan populations from reaching their peaks seen a hundred years ago [1]. Their analysis implements linear models with a count proxy as a function of time to estimate linear change and wavelet analysis to detect cyclic periods. This report’s aim is to further Hjelford and Loe’s work by capturing the stochastic population dynamics and the role of alternative prey using the partially observed markov process (POMP) framework. POMP models allow researchers to make inferences about the underlying dynamics of the system by linking observed and unobservable variables. Applications of these models have been carried out extensively in finance and epidemiology, but far less so in ecological systems.
Our data includes 142 observations from 1872 to 2012. The harvest data (CPUE) was recorded by hunters in the Southeastern mountain regions of Norway. The data is provided by the authors of the study and is hosted by Dryad here. A description of the dataset provided by the authors is given below:
year | logCPUE | R |
---|---|---|
1872 | 3.555348 | 1 |
1873 | 2.995732 | 0 |
1874 | 2.995732 | 0 |
1875 | 3.433987 | 1 |
1876 | 3.465736 | 1 |
1877 | 2.708050 | 0 |
1878 | 1.945910 | 0 |
1879 | 2.995732 | 0 |
1880 | 3.401197 | 1 |
1881 | 2.890372 | 0 |
Looking at both CPUE and log.CPUE in Figure 2.1, it’s evident that ptarmigan populations has steadily decreased over the 142 year-long period. The blue shaded regions in plot C highlight years when peak rodent year is equal to 1 signifying yes. Notice how both plots exhibit peaks in the shaded regions and troughs otherwise, which is exactly what we expected to see from the alternative prey hypothesis. It’s also worth noting that this trend is far less apparent beyond the early 1900s. Figure 2.2 displays side-by-side boxplots of logCPUE vs. peak rodent year, confirming that ptarmigan counts are generally higher in years of rodent abundance. Note: in accordance with the previous study, our analysis will use log.CPUE to reduce heteroscedasticity.
Figure 2.3 depicts the ACF plot of log.CPUE at 100 lags. The plot reinforces what we already know - the time series is not stationary, or that future values of the time series are correlated with past values. More specifically, the initial 40 lags show strong correlation with the strongest being the first 5. The PACF plot in Figure 2.1, the partial correlation dies out after 5 lags. For an ARMA model, it may be appropriate to select an \(AR(p=5)\) component.
Given the long-range significance in the ACF plot and obvious non-stationarity, differencing the series may be necessary to fit a simple time series model to the data. Moreover, the ACF appears to be approaching a seasonal behavior, however we cannot apply STL or classical decomposition procedures as our sample of data has less than 2 periods. The repeated short-term cycle in 5 lags could be explained by the partial correlation. That being said, the purpose of this report is to explore the pomp framework, thus we elect to use a very simple ARMA process to serve as a basis for our more sophisticated models.
We will start off by fitting a simple ARMA model with parameters chosen using the algorithmic approach, grid search. This simple model will serve as a reference against which the performance of the more advanced POMP model is measured. In theory, the ARMA model will serve as a yardstick for managing expectations. Of the models shown in Table 3.1, we select the one with the Akaike’s information criterion (AIC). AIC is essentially minus twice the maximized log likelihood plus twice the number of parameters, and is defined by:
\[AIC=-2\times\ell(\theta^*)+2D\]
ARMA model can’t be performed on the non-stationary time series data. The KPSS test is implemented for a stationarity check.
## Warning in tseries::kpss.test(ts_cpue, null = "Level"): p-value smaller than
## printed p-value
##
## KPSS Test for Level Stationarity
##
## data: ts_cpue
## KPSS Level = 2.1658, Truncation lag parameter = 4, p-value = 0.01
## Warning in tseries::kpss.test(ts_diff, null = "Level"): p-value greater than
## printed p-value
##
## KPSS Test for Level Stationarity
##
## data: ts_diff
## KPSS Level = 0.022198, Truncation lag parameter = 4, p-value = 0.1
KPSS with p-value = 0.01 suggests that the average daily hunting of birds(CPUE) in log scale is non-stationary. However, first-order difference on this time series data is stationary, proved by the p-value larger than \(\alpha=0.05\) from KPSS test. Then the following ARMA(p,q) model will be run on such differenced data.
MA0 | MA1 | MA2 | MA3 | MA4 | MA5 | |
---|---|---|---|---|---|---|
AR0 | 243.32 | 230.80 | 223.58 | 221.22 | 223.19 | 204.21 |
AR1 | 238.41 | 216.56 | 218.55 | 217.95 | 219.84 | 206.21 |
AR2 | 226.49 | 227.49 | 220.37 | 219.71 | 217.91 | 207.56 |
AR3 | 226.66 | 222.36 | 221.82 | 207.03 | 216.42 | 208.60 |
AR4 | 224.66 | 222.97 | 224.86 | 213.67 | 208.54 | 209.68 |
The best ARMA(0,5) is selected given the lowest AIC with 204.48 to fit first-order differenced data. Thus, ARIMA(0,1,5) will be run below to check for its log-likelihood.
We choose the ARIMA(0,1,5), as out simple benchmark model. The corresponding log-likelihood value comes out to be -99.32.
Partially observed Markov process models, also known as state-space models, hidden Markov models, and stochastic dynamical system models are probabilistic models used to describe the evolution of a system over time when some of the system’s variables are “hidden” or unobservable [8]. It consists of two main components: a latent process model, representing the unobservable states of the system, and a measurement model, linking the latent states to the observed data [9]. By incorporating both observed and unobserved variables, POMP models allow researchers to make inferences about the underlying dynamics of the system. A simple visual representation of the process is depicted in the image below.
We propose modeling the pomp framework using a variation of the Lotka–Volterra equations [10]. The Lotka-Voltera equations, also referred to as the predator-prey equations, are a pair of first order nonlinear differential equations. They are frequently used to describe the dynamics of biological systems in which two species interact, one as a predator and the other as prey. Our analysis focuses on a slightly different approach that introduces the idea of an alternative prey. The alternative prey hypothesis supposes that predators supported by a primary species will shift to consume alternative prey during a decrease in primary prey abundance. Despite this occurring in many systems, the mechanisms are poorly understood [11], and so we offer an avenue to explore by introducing the pomp framework.
The measurement model \(Y(t)\) is our ptarmigan count proxy, logCPUE, \(y(t) =Negative\ Binomial(mean=\rho\beta_t, \sigma)\), and the process model \(X(t)\) includes predator (fox) population density, preferred (rodent) prey population density, and alternative prey (bird) population density. We model how the log of the populations of foxes and ptarmigans change through time in Equations (3.1) and (3.2).
\[\begin{equation} \log F_{t+dt} = \log F_t + dt(bR_t + c\exp(\log B_t)[1 - \gamma R_t] - a)W_t^F \tag{3.1} \end{equation}\]
\[\begin{equation} \log B_{t+dt} = \log B_t + dt(\alpha + \beta \exp(\log F_t)[1 - \gamma R_t])W_t^F \tag{3.2} \end{equation}\]
In equation (3.1):
Similarly, in equation (3.2):
We start by fitting our model to the data using a local search. Code for the model can be viewed using the black drop-down boxes throughout the report. Version 5.6 of the pomp
package was used to carry out the analysis. We use a particle filter applied to a pomp object over time and assess its performance. We initialize the process with a guess of parameter values given by the following: \(log(F_0) = 1, log(B_0) = 1, a = 1, b = 2, c = 1, \alpha = 1, \beta = 1, \gamma = 0.5, \sigma_F = 0.1, \sigma_B = 0.1, log(\rho) = 3, \sigma_{obs} = 0.1\)
Csnippet("
double dwF, dwB;
dwF = rgammawn(sigmaF, dt);
dwB = rgammawn(sigmaB, dt);
logF += (b*R + c*exp(logB)*(1-gamma*R)-a)*dwF;
logB += (alpha - Beta*exp(logF)*(1-gamma*R))*dwB;
") -> step
Csnippet("
logF = logF_0;
logB = logB_0;
") -> rinit
Csnippet("
logCPUE = rnorm(logB - logRho, sigma_obs);
") -> rmeas
Csnippet("
lik = dnorm(logCPUE, logB - logRho, sigma_obs, give_log);
") -> dmeas
The effective sample size (ESS) in the second panel of Figure 3.1 displays a presence of occasional spikes interspersed with predominantly low values. It suggests a particle degeneracy issue. In other words, only a few samples are providing meaningful information, which lead to a less robust and a less reliable estimate. The third panel shows the conditional log-likelihood over time which is a measure of how well the model with its current parameters. The log-likelihood appears to be fluctuating and showing some periods of higher values interspersed with periods of significant decline, indicating varying model performance over time. We can conclude that our initial guess of parameter values is poor and we will do a local search of parameters for a more stable model performance.
A single execution of the mif2
function applied to model provides a snapshot of the model’s behavior under a specific set of initial conditions.
## est se
## -288.64760 30.88034
The estimated log-likelihood is -205 with standard error of 3.14.
Iterated Filtering (IF2) is implemented by a foreach
loop which iterates 20 times, and in each iteration of the loop the IF2 function is applied to our model. This is a good way to explore the variability in outcomes due to initial conditions and other stochastic elements of the modeling process as well as to control the robustness. The variation in trajectories across the iterations suggests that the algorithm is exploring the parameter space and adjusting the estimates as it receives more information from the data.
It estimated log likelihood of -134 which is larger than a single execution of mif2
function, but with a larger standard error of 8.2.
From the above plot, the likelihood over iterations quickly rises and then levels off. Towards the later iterations, the changes in parameter estimates become smaller, indicating that the search for the maximum likelihood is flattening out. This can be a sign that the search is approaching a local maximum, but the fact that the changes are small but not zero implies that there might still be some uncertainty. But not all parameters show signs of convergence, such as \(log(B_0), \sigma_F\).
We find that the higher likelihood is usually associated with lower values of \(\alpha, \gamma, log(\rho), \sigma_{obs}\). \(log(F_0), log(B_0)\), the initial conditions for Fox population and birds population in log scale, shows a general upward trend over iterations. \(a, b, c, \sigma_F, \sigma_B\) shows a general flat trend near zero except for a few iterations with a dramatically increasing trend.
A scattered or cloud-like pattern indicates little to no apparent linear correlation.
The best estimated log likelihood under global search by the use of GreatLakes is -176.3, which is a lower than the log likelihood from ARIMA (log-likelihood with -99) and local search(log-likelihood with -134). It is opposite to our expectation. One of the reasons is that we limit the parameter space to save the running time on Great Lakes, otherwise it will take more than 8 hours to run given the fact that our model is built with a large number of parameters. The accuracy will be increased if we increase the parameter space.
As a result, he log-likelihood produced an ARIMA(0,1,5) model is better than that from a local search of a POMP model, it implies that the ARMA model is fitting the data better according to this metric.
Our report probes the potential of applying POMP to the field of ecological population dynamics. It ends with ARIMA(0,1,5) produces the highest log-likelihood with -99. We began with the establishment of a foundational ARMA model to serve as a reference point against which we evaluated our proposed POMP models. Drawing inspiration from the Lotka-Voltera equations, we introduce our take on the concept of an alternative prey hypothesis. Although our analysis is on the side of preliminary, it shows this approach can shed more light on the dynamic responses of ecosystems to changes in prey abundance.