Abundance estimates for all species in Parita Bay, Panama

Site abundance prediction per species

Authors

Diego J. Lizcano

Jorge Velásquez-Tibata

Published

May 17, 2024

Fit the occupancy model of Royle and Nichols (2003), which relates probability of detection of the species to the number of individuals available for detection at each site.

Royle, J. A. and Nichols, J. D. (2003) Estimating Abundance from Repeated Presence-Absence Data or Point Counts. Ecology, 84(3) pp. 777–790.

The number of animals available for detection at site i is modelled as Poisson: \[ N_1 \sim Poisson(\lambda_i) \]

We assume that all individuals (of same species) at site i during sample j have identical detection probabilities, \(r_{i,j}\) and and that detections are independent. The species will be recorded if at least one individual is detected. Thus, the detection probability for the species is linked to the detection probability for an individual by:

\[ p_{i,j} = 1-(1-r_{i,j})^{N_i} \] The equation for the detection history is then: \[ y_{i,j} \sim Bernoulli(p_{i,j}) \] We used the unmarked package to make the estimates.

Read data

We read the data from excel and csv.

We prepared the data making a list of detection histories per species using the custom function f.matrix.creator2 and later collapsing the detection histories to 3 days using the custom function f.shrink.matrix.h.to9

Code
# load the custom functions to create a list of detection histories per species 
source("C:/CodigoR/AudubonPanama/R/matrix_creator.R")


############### load Packages
library(readr) # readr 
library(readxl) # read excel
library(plyr) 
library(dplyr)

library(unmarked) # occupancy by maximum likelihood
library(sf) # simple feature to get coords
library(DT) # to make data table with buttons
library(terra) # to convert sf to vector
library(tidyverse) #  some times 

############### load data
# old 
# parita_data_full <- read_delim("C:/CodigoR/AudubonPanama/data/all_records_audubon_panama_to_occu.csv", 
#                               delim = "\t", escape_double = FALSE, 
#                               col_types = cols(eventDate = col_date(format = "%Y-%m-%d")), 
#                               trim_ws = TRUE)

# new
parita_data_full <- read_excel("C:/CodigoR/AudubonPanama/data/all_records_audubon_panama_Darwin_Core_revision.xlsx", 
    col_types = c("text", "text", "text", 
        "date", "text", "text", "numeric", 
        "numeric", "text", "text", "numeric", 
        "text", "numeric", "numeric", "text", 
        "numeric", "numeric", "numeric", 
        "numeric", "numeric", "text", "text", 
        "numeric", "text", "numeric", "text", 
        "text", "text", "text", "text", "numeric", 
        "text", "numeric", "text", "numeric", 
        "text", "text"))

covs <- read_csv("C:/CodigoR/AudubonPanama/shp/sites_covs_parita_nona.csv")

## Nyctidromus albicollis
## Aramus guarauna
## Buteogallus anthracinus
## Dryocopus lineatus
## Piranga rubra

# filter by threshold and species
parita_data_allsp <-  parita_data_full %>% 
                      filter(confidence >= threshold)# %>%
                    # filter(scientificName == c("Nyctidromus albicollis", 
                    #                             "Aramus guarauna",
                    #                             "Buteogallus anthracinus",
                    #                             "Dryocopus lineatus",
                    #                             "Piranga rubra"
                    #                             ))  

# insert Max and min to get matrix limits
# start_date
parita_data2 <- parita_data_allsp %>% 
  group_by(locationID) %>% 
  mutate(start_date = min(eventDate),
         end_date= max(eventDate))

# apply function to get matrix per species
parita <- f.matrix.creator2(data = parita_data2, year = 2023)


# # centroide for terra
# # centoide <- centroids(puntos, TRUE)
# centroide <- c(mean(as.numeric(parita_data_allsp$decimalLongitude)), mean(as.numeric(parita_data_allsp$decimalLatitude)))
# clip_window <- extent(-80.67105, -80.05822, 7.713744, 8.489888) # 7.932311, -80.612233  8.360073, -80.180109
# bb <- c(-80.612233, -80.180109, 7.932311, 8.360073)
# srtm <- raster::getData('SRTM', lon=centroide[1], lat=centroide[2])

# crop the  raster using the vector extent
# srtm_crop <- raster::crop(srtm, clip_window)
 #rast(srtm_crop) # convert to terra


# altitud <- elevation_3s(-72.893262, 7.664081007, path="data")
# altitud <- as.numeric(Camptostoma_obsoletum_occu_dia$Altitude)

# convierte covs a puntos terra
puntos <- vect(covs, geom=c("Longitude", "Latitude"), crs="EPSG:4326")
# convierte a sf
puntos_sf <- sf::st_as_sf(puntos)

######### precipitacion
precip_Anton <- read_csv("C:/CodigoR/AudubonPanama/data/Precip_Anton.csv")
precip <-  data.frame(f.shrink.matrix.h.to9(precip_Anton))
precip[,1] <- 0 #fix start


#

Function to get abundance

These methods return an object storing the posterior distributions of the latent variables at each site. We used the empirical Bayes methods used by unmarked::ranef which can underestimate the variance of the posterior distribution because they do not account for uncertainty in the hyperparameters (lambda). Note also that the posterior mode appears to exhibit some bias as an estimator or abundance. Consider using the posterior mean instead, even though it will not be an integer in general.

One species, single season model in a custom function

The function ranef_by_sp calculates the abundance per species.

Code
ranef_by_sp <- function(spx=1){ suppressWarnings({ #species name
# Make UMF object
umf_y_full_spx<- unmarkedFrameOccu(y=f.shrink.matrix.to9( # y collapsed to 3days
                                   as.data.frame(parita$occur[[spx]])),
                                   siteCovs=NULL,
                                   obsCovs=list(prec=precip,
                            hour= data.frame(hour = f.shrink.matrix.h.to9(parita$dete[[spx]]))
                                                )
                                   ) 


# fit unmarked
fit_unm_spx_0 <- unmarked::occuRN(~1~1, data=umf_y_full_spx) # ok!
fit_unm_spx_p <- unmarked::occuRN(~ scale(prec) ~1, data=umf_y_full_spx) # ok!
fit_unm_spx_h <- unmarked::occuRN(~ scale(hour) ~1, data=umf_y_full_spx) # ok!


# fit list for occupancy
fms2<-fitList("p(.) lambda(.)"=fit_unm_spx_0,
              "p(prec) lambda(.)"=fit_unm_spx_p,
              "p(hour) lambda(.)"=fit_unm_spx_h)
              # #"p(h) Ocu(.)"=fit_unm_sp2_0h, #,
              # #"p(c) Ocu(.)"=fit_unm_sp2_0c,
              # "p(.) lambda(BGB_Spawn)"=fit_unm_sp2_1,
              # "p(.) lambda(human_foot)"=fit_unm_sp2_2,
              # "p(.) lambda(NDVI)"=fit_unm_sp2_3,
              # "p(.) lambda(river)"=fit_unm_sp2_4,
              # #"p(.) Ocu(river^2)"=fit_unm_sp2_4s,
              # "p() lambda(carbon)"=fit_unm_sp2_5,
              # "p() lambda(canopy)"=fit_unm_sp2_6,
              # "p() lambda(canopy+BGB_Spawn)"=fit_unm_sp2_7,
              # "p() lambda(BGB_Spawn+human_foot)"=fit_unm_sp2_8,
              # "p() lambda(ALL)"=fit_unm_sp2_9
              # )

# model selection detection unmarked
 models <- modSel(fms2)
 models
 

# Empirical Bayes estimates of abundance at each site by species
re_0 <- ranef(fit_unm_spx_0)
re_h <- ranef(fit_unm_spx_h)
# plot(re)
# post.df <- as(re_0, "data.frame")

#####################################################
# to save the ranef result (S3 Object) as dataframe #
#####################################################

abund_per_site <- data.frame(sp=names(parita$dete[spx]), 
                             site=names( parita$occu[[1]][,1]),
                             mode=(bup(re_0, stat="mode") ),
                             mean=(bup(re_0, stat="mean") ), 
                             confint(re_0, level=0.9))

}) # no warning part
  
  
# return(list(paste("###", sp=names(parita$dete[spx]) ), # sp name
#             models, # model AICs
#             Null=re_0, # mull model estimates
#             hour=re_h)) # hour in detection estimates

  # return to export table
  return(abund_per_site)
  
} # end of function

use the function

Runs a loop on a list containing the species, and later making a table of species sites and abundances.

Table of abundances

Code
# library (plyr)
# Convert the list to a dataframe
df <- ldply(abun_site, data.frame)

# save to table
# write.csv(df, "C:/CodigoR/AudubonPanama/data/abundance_result.csv")

# make table with buttons 
datatable(
  df, extensions = 'Buttons', options = list(
    dom = 'Bfrtip',
    buttons = c('copy', 'csv')
  )
)

Information of R session.

Code
print(sessionInfo(), locale = FALSE)
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default


attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lubridate_1.9.3 forcats_1.0.0   stringr_1.5.1   purrr_1.0.2    
 [5] tidyr_1.3.1     tibble_3.2.1    ggplot2_3.5.0   tidyverse_2.0.0
 [9] terra_1.7-71    DT_0.32         sf_1.0-15       unmarked_1.4.1 
[13] dplyr_1.1.4     plyr_1.8.9      readxl_1.4.3    readr_2.1.5    

loaded via a namespace (and not attached):
 [1] gtable_0.3.4       bslib_0.6.1        xfun_0.42          htmlwidgets_1.6.4 
 [5] lattice_0.22-5     tzdb_0.4.0         crosstalk_1.2.1    vctrs_0.6.5       
 [9] tools_4.3.2        generics_0.1.3     parallel_4.3.2     proxy_0.4-27      
[13] fansi_1.0.6        pkgconfig_2.0.3    Matrix_1.6-1.1     KernSmooth_2.23-22
[17] lifecycle_1.0.4    compiler_4.3.2     munsell_0.5.0      codetools_0.2-19  
[21] sass_0.4.8         htmltools_0.5.7    class_7.3-22       yaml_2.3.8        
[25] jquerylib_0.1.4    nloptr_2.0.3       crayon_1.5.2       pillar_1.9.0      
[29] ellipsis_0.3.2     MASS_7.3-60        classInt_0.4-10    cachem_1.0.8      
[33] boot_1.3-28.1      nlme_3.1-163       tidyselect_1.2.1   digest_0.6.34     
[37] stringi_1.8.3      splines_4.3.2      fastmap_1.1.1      grid_4.3.2        
[41] colorspace_2.1-0   cli_3.6.2          magrittr_2.0.3     utf8_1.2.4        
[45] e1071_1.7-14       withr_3.0.0        scales_1.3.0       bit64_4.0.5       
[49] timechange_0.3.0   rmarkdown_2.26     lme4_1.1-35.1      bit_4.0.5         
[53] cellranger_1.1.0   hms_1.1.3          evaluate_0.23      knitr_1.45        
[57] rlang_1.1.3        Rcpp_1.0.12        glue_1.7.0         DBI_1.2.2         
[61] minqa_1.2.6        rstudioapi_0.16.0  vroom_1.6.5        jsonlite_1.8.8    
[65] R6_2.5.1           units_0.8-5