simulations

Illustration of a simulation with the RLA operating model

This vignette is to showcase some functions of the RLA package, using the data on the harbour porpoises (Phocoena phocoena) in the North Sea. The data is included in the package.

library("tidyverse")
#> -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
#> v ggplot2 3.3.3     v purrr   0.3.4
#> v tibble  3.1.0     v dplyr   1.0.5
#> v tidyr   1.1.3     v stringr 1.4.0
#> v readr   1.4.0     v forcats 0.5.1
#> -- Conflicts ------------------------------------------ tidyverse_conflicts() --
#> x dplyr::filter() masks stats::filter()
#> x dplyr::lag()    masks stats::lag()
library("knitr")
library("RLA")
library("rstan")
#> Loading required package: StanHeaders
#> rstan (Version 2.21.2, GitRev: 2e1f913d3ca3)
#> For execution on a local, multicore CPU with excess RAM we recommend calling
#> options(mc.cores = parallel::detectCores()).
#> To avoid recompilation of unchanged Stan programs, we recommend calling
#> rstan_options(auto_write = TRUE)
#> Do not specify '-march=native' in 'LOCAL_CPPFLAGS' or a Makevars file
#> 
#> Attaching package: 'rstan'
#> The following object is masked from 'package:tidyr':
#> 
#>     extract
options(mc.cores = parallel::detectCores())
rstan::rstan_options(auto_write = TRUE)

data("north_sea_hp")
data("scenarios_north_sea_hp")

Scenarios

The table scenarios_north_sea_hp has all the information to simulate populations to a desired level of depletion (see the vignette on ‘scenarios’ to learn on how these were generated).

scenarios_north_sea_hp %>% 
  head(5) %>% 
  kable()
current_depletion MNP MNPL level cv_obs cv_env cv_byc rate seed
0.5417200 1.02 0.5 (0.4,0.6] 0.231 0.052 0.195 0.014 271317
0.4005977 1.02 0.5 (0.4,0.6] 0.239 0.065 0.388 0.021 921891
0.4947520 1.02 0.5 (0.4,0.6] 0.193 0.110 0.430 0.012 803073
0.5601648 1.02 0.5 (0.4,0.6] 0.298 0.155 0.428 0.008 737284
0.4948804 1.02 0.5 (0.4,0.6] 0.226 0.070 0.179 0.014 191156

For example, we’ll use the first line in the table to simulate a population dynamics with pellatomlinson_rla. It is important here to use the seed stored in the table to achieve the desired level of depletion of the population. All inputs needed to carry out a simulation on the harbour porpoise in the North Sea are stored in the list north_sea_hp$life_history and the table scenarios_north_sea_hp.

dyn <- pellatomlinson_rla(MNPL = scenarios_north_sea_hp$MNPL[1],
                          K = north_sea_hp$life_history$K,
                          L = north_sea_hp$life_history$L,
                          eta = north_sea_hp$life_history$eta,
                          phi = north_sea_hp$life_history$phi,
                          m = north_sea_hp$life_history$maturity,
                          MNP = scenarios_north_sea_hp$MNP[1],
                          # series of catches
                          catches = bycatch_rw(n = 51,
                                               K = north_sea_hp$life_history$K,
                                               rate = scenarios_north_sea_hp$rate[1],
                                               cv = scenarios_north_sea_hp$cv_byc[1],
                                               seed_id = scenarios_north_sea_hp$seed[1]
                                               ),
                          # expected coefficient of variation for the survey estimates
                          CV = scenarios_north_sea_hp$cv_obs[1],
                          # CV of environmental stochasticity on birth rate
                          CV_env = scenarios_north_sea_hp$cv_env[1],
                          # scans survey
                          scans = c(29, 40, 51),
                          verbose = TRUE,
                          everything = TRUE,
                          seed_id = scenarios_north_sea_hp$seed[1]
                          )
#>   Burn-in: running population dynamics for 500 years
#>       initial depletion level:  5 % of carrying capacity
#>       Correlated random walk for environmental stochasticity in birth rate
#>       Returning complete output

The simulation output can be visualized (this is just a very small peek at the output of pellatomlinson_rla):

data.frame(time = 1:length(dyn$depletion),
           depletion = dyn$depletion
           ) %>% 
  ggplot() +
  geom_hline(yintercept = 0.8, linetype = "dotted", color = "tomato", size = 2) +
  geom_line(aes(x = time, y = depletion), color = "midnightblue", size = 1) +
  scale_x_continuous(name = "# years", breaks = seq(0, 550, 50)) +
  scale_y_continuous(name = "% depletion", breaks = seq(0, 1, 0.1)) +
  theme_bw()

Here, the population was allowed to reach its carrying capacity, but some \(50\) years ago, a period of anthropogenic removals began. Removals were unmanaged and depleted the population down to \(\approx 55\)% of carrying capacity \(\mathrm{K}\).

It is assumed that \(3\) SCANS-like surveys took place in years \(529\), \(540\) and \(551\) (this spacing corresponds to the actual spacing of the real SCANS surveys):

dyn$survey %>% 
  mutate(scans = 500 + scans) %>% 
  kable()
mean cv scans
324043 0.231 529
148417 0.231 540
311291 0.231 551

Removal Limit Algorithm

The RLA is a management procedure that will simulate the population dynamics for another \(100\) years with catch limits set with updated survey estimates and information on anthropogenic removal. \(T\) is now, and we want to start using the RLA from now on, assuming some sort of management actions will be taken from now on to monitor bycatch (so we’ll get estimates of bycatch). We’re also assuming that a new abundance estimate will be available every \(6\) years. The RLA management procedure consists in using the RLA over a time horizon (\(100\) years) and assess whether the conservation objective is being met over that horizon. Management objectives (setting a catch limit) are being tested in the process.

We will illustrate the management procedure using the median (\(q = 0.5\)) to tune the catch limit. We can assume that there is some uncertainty in the bycatch numbers that are fed into the RLA. This uncertainty is limited to bycatch number before the RLA starts to be used to set a catch limit. There is however variability in the realized number of bycaught animals around the limit set with the RLA. We can feed directly the output of pellatomlinson_rla to forward_rla.

We first need to compile the RLA statistical modeling order to use it. We’ll need the library rstan and the model code stored in stan_models. Here we’ll use uniform priors.

## load Stan models and look at the one with uniform priors
data(rlastan_models)
cat(rlastan_models$uniform)
#> data {
#>  int<lower = 1> n_survey; // nb of surveys
#>  int<lower = 1> n_year; // nb of years with removals
#>   vector<lower = 0.0>[n_survey] SURVEY_mean; // abundance estimates
#>  vector<lower = 0.0>[n_survey] SURVEY_cv;   // cv of abundance estimates
#>  vector<lower = 0.0>[n_year] BYCATCH; // bycatch estimates
#>  real<lower = 0.0> IPL; // Internal Protection Level, same scale as depletion;
#>   // The downweighting is implemented to reduce variability in bycatch limits (Cooke 1999)
#>  real<lower = 0.0, upper = 1.0> W; // weight for down-weighting the likelihood as in the IWC CLA
#>  real<lower = 0.0> UPPER; // upper limit for growth rate
#>   int<lower = 1, upper = n_year + 1> SCANS[n_survey]; // pointers to years of SCANS survey
#> }
#> 
#> transformed data {
#>  vector[n_survey] scale = sqrt(log1p(square(SURVEY_cv)));
#>  vector[n_survey] location = log(SURVEY_mean) - 0.5 * log1p(square(SURVEY_cv));
#> }
#> 
#> parameters {
#>  real<lower = 1E-4, upper = UPPER> r; // population growth rate
#>  real<lower = 1E-4, upper = 1.0> depletion; // depletion at the end of the time-series
#> }
#> 
#> transformed parameters {
#>  vector[n_year + 1] abundance;
#>  real K;
#>  // what's observed in the latest survey is a fraction of carrying capacity
#>  K = SURVEY_mean[n_survey] / depletion;
#>  // the population begins at carrying capacity
#>  abundance[1] = K;
#>  for(t in 1:n_year) {
#>      abundance[t + 1] = abundance[t] - BYCATCH[t] + r * abundance[t] * (1 - square(abundance[t] / K));
#>  }
#> }
#> 
#> model {
#>  for(i in 1:n_survey) {
#>      target += lognormal_lpdf(abundance[SCANS[i]]| location[i], scale[i]) * W;
#>  }
#> }
#> 
#> generated quantities {
#>  real removal_limit;
#>  removal_limit = r * fmax(0.0, depletion - IPL);
#> }
# compile model
rlastan <- rstan::stan_model(model_code = rlastan_models$uniform,
                             model_name = "RLA"
                             )

The object rlastan is of class stanmodel and is a compiled C++ executable that you can now call from your R environment. You just need to compile it once. Please check http://mc-stan.org for more on Stan. We now have everything to apply the management procedure. This is what the function forward_rla does. It cycles between simulating a population dynamics (according to a Pella-Tomlinson process as coded in pellatomlinson_rla) with an estimation step and updating of the catch limit with updated data every 6 years (using the Stan function sampling, which is called by forward_rla. See ?stan::sampling and ?forward_rla).

The function takes two arguments, frequency and horizon (see also the help) which control the frequency at which new information is collected to update the catch limit with Stan (by default, it is every \(6\) years); and for how long should the procedure be run (by default, \(100\) years).

rla <- forward_rla(rlalist = dyn,
                   rlastan = rlastan,
                   q = 0.5,
                   distribution = "truncnorm",
                   # upper bound for the prior on r, the population growth rate
                   upper = 0.1,
                   save_log = FALSE
                   )

This took some time. The management procedure is simulated over \(100\) years by default, with an update very \(6\) years as should be the case with the MSFD. In practice, a new catch limit is thus estimated every six years with the RLA and its Stan implementation.

The dynamics is plotted below, and shows that using the RLA to estimate the catch limit, and setting the actual catch limit at the median of the posterior distribution does not allows the population to recover in this example:

rla$depletion %>% 
  ggplot() +
  geom_line(aes(x = 551 + time, y = depletion), color = "midnightblue") +
  theme_bw()

The summary of the RLA is stored in a table, from which management objective can be assessed:

rla$management %>% 
  mutate(MSFD_cycle = 1:n()) %>% 
  dplyr::select(MSFD_cycle, catch_limit, N_hat, convergence) %>% 
  kable()
MSFD_cycle catch_limit N_hat convergence
1 0.00895 311291 1
2 0.00972 280429 1
3 0.00889 245502 1
4 0.00175 186203 1
5 0.01226 336200 1
6 0.01194 244914 1
7 0.00971 224237 1
8 0.01379 325950 1
9 0.00501 183390 1
10 0.01069 375288 1
11 0.01486 262040 1
12 0.01210 220638 1
13 0.00574 470739 1
14 0.01057 395189 1
15 0.00575 193365 1
16 0.01642 291093 1
17 0.01506 231942 1

Each row in this table corresponds to the start of a MSFD cycle of \(6\) years. This example illustrates how posterior uncertainty is handled in the RLA, resulting in setting a nil limit in some instances (when the current survey estimate is low).

The whole simulation is:

theme_set(theme_bw(base_size = 16))
data.frame(time = c(1:length(dyn$depletion), 
                    length(dyn$depletion) + rla$depletion$time
                    ),
           depletion = c(dyn$depletion, rla$depletion$depletion),
           anthropogenic_removal = c("unmanaged", "managed with RLA")[c(rep(1, length(dyn$depletion)), rep(2, nrow(rla$depletion)))]
           ) %>% 
  ggplot() +
  geom_hline(yintercept = 80, linetype = "dotted", color = "tomato", size = 2) +
  geom_line(aes(x = time - 550, y = 100 * depletion, color = anthropogenic_removal), size = 1) +
  scale_color_manual(values = c("seagreen", "midnightblue")) +
  scale_x_continuous(name = "# years (0 is present)", breaks = seq(-500, 150, 50)) +
  scale_y_continuous(name = "% depletion", breaks = 100 * seq(0, 1, 0.1)) +
  theme(legend.position = "bottom")

Tuning

Tuning is the process of selecting a quantile in the posterior distribution of the quantity \[\begin{equation} \frac{\mathrm{catch~limit}_{T}}{\hat{N}_{T}} = r \times \max\left(0, \mathrm{depletion}_T - \mathrm{IPL}\right) \end{equation}\]

This quantile is then used as the catch limit. The selection is done by testing several quantiles and choosing the one that meets the conservation objective. Here we will assume that the conservation objective is to maintain the population at or above \(80\)% of \(\mathrm{K}\). Several quantiles will be tested. In order to allows use the same seeds across the several test of quantiles, we set the argument random to FALSE.

tuning <- lapply(seq(0.2, 0.8, 0.05), function(q) {
  forward_rla(rlalist = dyn,
              rlastan = rlastan,
              q = q,
              distribution = "truncnorm",
              # upper bound for the prior on r, the population growth rate
              upper = 0.1,
              random = FALSE,
              save_log = FALSE
              )
})

We can now investigate the quantiles for which the population was able to recover and attain the conservation objective.

results <- do.call('rbind', 
                   lapply(1:length(tuning), 
                          function(i) { tuning[[i]]$management %>% mutate(t = 1:n()) })
                   ) %>%
  left_join(scenarios_north_sea_hp,
            by = "seed"
            )

trajectory <- do.call('rbind', 
                      lapply(1:length(tuning), 
                             function(i) { tuning[[i]]$depletion })
                      ) %>%
  left_join(scenarios_north_sea_hp,
            by = "seed"
            )

The table trajectory stores the simulated population dynamics with the various choice of quantiles. These can be contrasted to the scenario in which all anthropogenic removals are banned. This scenario is easy to run with a call to pellatomlinson_rla in which some of the output of the object dyn are now inputs. We should however set arg. random to TRUE in order to obtain make sure that a new seed is used to further the simulation.

ban <- pellatomlinson_rla(Nf = dyn$Nf,
                          Nm = dyn$Nm,
                          z = dyn$z,
                          K = dyn$K,
                          L = dyn$L,
                          eta = dyn$eta,
                          phi = dyn$phi,
                          m = dyn$m,
                          b_max = dyn$b_max,
                          b_K = dyn$b_K,
                          # series of catches
                          catches = rep(0, 100),
                          # expected coefficient of variation for the survey estimates
                          CV = dyn$CV,
                          # CV of environmental stochasticity on birth rate
                          CV_env = dyn$CV_env,
                          rho = dyn$rho,
                          verbose = TRUE,
                          everything = FALSE,
                          seed_id = dyn$seed_id#,
                          # random = TRUE
                          )
#>   No burn-in: running population dynamics from N =  258373
#>       initial depletion level:  52 % of carrying capacity
#>       Correlated random walk for environmental stochasticity in birth rate

All these results can be now visualized:

theme_set(theme_bw())
ggplot() +
  geom_line(data = data.frame(depletion = ban$depletion,
                              seed = ban$seed_id
                              ) %>% 
              mutate(time = 1:n() - 1) %>% 
              left_join(scenarios_north_sea_hp,
                        by = "seed"
                        ),
            aes(x = time, y = depletion), linetype = "dashed", color = "black", size = 2
            ) +
  geom_hline(yintercept = 0.8, color = "tomato", linetype = "dotted", size = 2) +
  geom_line(data = trajectory,
            aes(x = time, y = depletion, group = factor(seed * quantile), color = quantile)
            ) +
  ylab("Depletion (as % of K)") +
  theme(legend.position = "bottom")

The black dashed line shows what would be the population dynamics without any anthropogenic removals. The blue curves are the different population dynamics according to which quantile is used to set the catch limit: the smaller the quantile, the smaller the catch limit. The red dotted line shows the conservation objective, which is to reach \(80\)% of \(\mathrm{K}\) after \(100\) years. In this case, only the quantile \(q<0.3\) allow to reach the conservation objective. Note also the similar shape across the different curves. This stems from using random = FALSE above: the same seed is recycled through the multiple calls to forward_rla. This corresponds to counterfactuals, ie instances of the same scenarios save one input (the chosen quantile).

These results are also summarize in the managementtable which is an output from a call to forward_rla. We can focused on the last row of these tables to assess rapidly the conservation output at the time horizon of \(100\) years.

results %>% 
  mutate(bycatch = floor(catch_limit * N_hat)) %>%
  group_by(quantile) %>% 
  slice_tail() %>%
  kable()
catch_limit N_hat depletion K quantile catastrophe convergence seed t current_depletion MNP MNPL level cv_obs cv_env cv_byc rate bycatch
0.00436 420107 0.8426480 5e+05 0.20 0 1 271317 17 0.54172 1.02 0.5 (0.4,0.6] 0.231 0.052 0.195 0.014 1831
0.00378 426389 0.8397495 5e+05 0.25 0 1 271317 17 0.54172 1.02 0.5 (0.4,0.6] 0.231 0.052 0.195 0.014 1611
0.00855 372445 0.7490747 5e+05 0.30 0 1 271317 17 0.54172 1.02 0.5 (0.4,0.6] 0.231 0.052 0.195 0.014 3184
0.00836 375871 0.7024536 5e+05 0.35 0 1 271317 17 0.54172 1.02 0.5 (0.4,0.6] 0.231 0.052 0.195 0.014 3142
0.00552 444149 0.6941230 5e+05 0.40 0 1 271317 17 0.54172 1.02 0.5 (0.4,0.6] 0.231 0.052 0.195 0.014 2451
0.01486 275094 0.6449841 5e+05 0.45 0 1 271317 17 0.54172 1.02 0.5 (0.4,0.6] 0.231 0.052 0.195 0.014 4087
0.00148 170262 0.5038686 5e+05 0.50 0 1 271317 17 0.54172 1.02 0.5 (0.4,0.6] 0.231 0.052 0.195 0.014 251
0.01747 215613 0.4696747 5e+05 0.55 0 1 271317 17 0.54172 1.02 0.5 (0.4,0.6] 0.231 0.052 0.195 0.014 3766
0.01336 202636 0.4862323 5e+05 0.60 0 1 271317 17 0.54172 1.02 0.5 (0.4,0.6] 0.231 0.052 0.195 0.014 2707
0.02439 265945 0.3701694 5e+05 0.65 0 1 271317 17 0.54172 1.02 0.5 (0.4,0.6] 0.231 0.052 0.195 0.014 6486
0.01657 185633 0.4553557 5e+05 0.70 0 1 271317 17 0.54172 1.02 0.5 (0.4,0.6] 0.231 0.052 0.195 0.014 3075
0.02529 203223 0.4309811 5e+05 0.75 0 1 271317 17 0.54172 1.02 0.5 (0.4,0.6] 0.231 0.052 0.195 0.014 5139
0.02517 187824 0.3384123 5e+05 0.80 0 1 271317 17 0.54172 1.02 0.5 (0.4,0.6] 0.231 0.052 0.195 0.014 4727

This table has many columns summarizing the simulations. Conservation objectives can be assessed with the depletion column for example. The management action is the determination of the catch limit, which if multiplied by the current best available abundance estimate N_hat will gives the catch limit in number of animals. In this case, only the quantiles \(q<0.2\) allows to reach the conservation objective:

results %>% 
  group_by(quantile) %>% 
  slice_tail() %>%
  group_by(MNP, MNPL, level, quantile) %>%
  summarize(conservation = ifelse(depletion > 0.8, 1, 0),
            depletion = depletion
            ) %>% 
  ggplot() +
  geom_hline(yintercept = 0.8, linetype = "dotted", color = "tomato", size = 2) +
  geom_line(aes(x = quantile, y = depletion), color = "midnightblue", size = 2) +
  theme_bw()
#> `summarise()` has grouped output by 'MNP', 'MNPL', 'level'. You can override using the `.groups` argument.

To properly tune the RLA, many simulations should be done, and results should be averaged over these many simulations. This is the purpose of the inputs in table scenarios_north_sea_hp where information on \(100\) simulations for several scenarios are stored. Results from these should then be averaged to determine which quantile to use.