BFI - Carbono, Parita Bay, Panama

A model selection approach

Authors

Diego Lizcano

Jorge Velásquez-Tibata

Published

June 25, 2024

Question

Is carbon, measured as AGBsubplot, a good predictor of BFI?

Set up analysis

Load libraries and set some options.

Code
library(readxl) # Read Excel Files
library(gt) # Easily Create Presentation-Ready Display Tables
library(lubridate) # Make Dealing with Dates a Little Easier
library(stringr) # Simple, Consistent Wrappers for Common String Operations
library(sf) # Simple Features for R
library(MuMIn) # multimodel inference
library(metafor) # Meta-Analysis Package for R
eval(metafor:::.MuMIn) # helper functions we need so that MuMIn and metafor can interact 
library(visreg) # see trend

library(terra) # Spatial Data Analysis
library(sjPlot) # Data Visualization for Statistics in Social Science
# library(mapview)
library(corrplot) # Visualization of a Correlation Matrix
library(DT) # A Wrapper of the JavaScript Library 'DataTables'
library(grateful) # Facilitate Citation of R Packages
library(geodata) # worldclim data

library(see) # Model Visualisation Toolbox for 'easystats' and 'ggplot2'
library(performance) # Assessment of Regression Models Performance

library(tidyverse) # Easily Install and Load the 'Tidyverse'
library(ggeffects) # Create Tidy Data Frames of Marginal Effects for 'ggplot' from Model Outputs


options(scipen=99999)
options(max.print=99999)
options(stringsAsFactors=F)

Load Data

Code

Subplots_coord <- read_excel("C:/CodigoR/AudubonPanama/data/2024 may 29 Velasquez Muestreo Acustico ParB jhs data.xlsx") |> mutate(site=Name) 

Subplots_coord2 <- read_excel("C:/CodigoR/AudubonPanama/data/2024 may 30 Hoyos ParB Subplots Acustinc monitoring.xlsx") |> mutate(site=plotId) 


covs_manyraster <- rast("C:/CodigoR/AudubonPanama/raster/covs_many_raster.tif")


# covs <- read_csv("C:/CodigoR/AudubonPanama/shp/sites_covs_parita_nona.csv") |> mutate(site=Name) 

BFI_site<- read.csv("C:/CodigoR/AudubonPanama/data/BFI_site.csv", header = TRUE) |> left_join(Subplots_coord)
# convert covs to terra points
# puntos <- vect(BFI_site, geom=c("Longitude", "Latitude"), crs="EPSG:4326")
# convert to sf
covs_all<- sf::st_as_sf(BFI_site, coords = c("Long", "Lat"),
           crs = "EPSG:4326")

Carbono <- sf::st_as_sf(Subplots_coord2, coords = c("Long", "Lat"),
           crs = "EPSG:4326")

# see which is closer
# mapview(Audiomoths, alpha = 0) + mapview(Carbono)

# get extra Bio vars 9 and 17 requested by Jorge
bios <- worldclim_tile(var = "bio", res = 0.5,
  lat=8.000535, lon=-80.39227, path = "C:/CodigoR/AudubonPanama/raster/")

bio_covs <- terra::extract(bios, covs_all)
otherrast_covs <- terra::extract(covs_manyraster, covs_all)


covs_all$bio9 <- bio_covs$tile_28_wc2.1_30s_bio_9
covs_all$bio17 <- bio_covs$tile_28_wc2.1_30s_bio_17
covs_all$forest_integrity<- otherrast_covs$forest_integrity
covs_all$NDVI<- otherrast_covs$NDVI

# remove typology (column 33)
covs_all <- as.data.frame(covs_all)[,-33]
# LUC_Impact

Correlation at the sites (points) where the AudioMoth was installed

Code
# extract values from raster 
# covs_all1 <- terra::extract(covs_many_raster, puntos) 

# saved to avoid conflict between terra::extract and dplyr
# saveRDS(covs_all1, "C:/CodigoR/AudubonPanama/data/BFI/covs_all.RDS")
# save(covs_all, file = "C:/CodigoR/AudubonPanama/data/BFI/covs_all.Rda")
# covs_all <- readRDS("C:/CodigoR/AudubonPanama/data/BFI/covs_all.RDS")
#load the rda file
# covs_all <- load(file = "C:/CodigoR/AudubonPanama/data/BFI/covs_all.Rda")
# write.csv(covs_all)

# covs_all$site <- puntos$site # add site name

# change NA to 0 
# covs_all <- substr(covs_all, NA, 0) #replace in terra
# covs_all <- BFI_site
covs_all[is.na(covs_all)] <- 0
M = cor(covs_all[,c(18:27, 30:32)]) # removes ID, site, and many other columns
corrplot(M)

Covariates removed: [25] “Average TC top 50 cm”; [26] “Average TN top 50 cm”; [22] “BGC”.
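As a quick cross-check of the correlation plot, the highly correlated pairs can also be listed directly from the matrix M. A minimal sketch, with an arbitrary |r| > 0.7 threshold chosen only for illustration:

Code
# list covariate pairs with |r| > 0.7, using the upper triangle so each pair appears once
high <- which(upper.tri(M) & abs(M) > 0.7, arr.ind = TRUE)
data.frame(var1 = rownames(M)[high[, 1]],
           var2 = colnames(M)[high[, 2]],
           r    = round(M[high], 2))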

Which model predicts BFI the best?

Information-theoretic approaches provide methods for model selection and (multi)model inference that differ quite a bit from more traditional methods based on null hypothesis testing (e.g., Anderson, 2008; Burnham & Anderson, 2002). These methods can also be used in the meta-analytic context when model fitting is based on likelihood methods.

We will now examine the fit and plausibility of various models, focusing on models that contain anywhere from none to seven of these possible predictor covariates of BFI.

Sticking to models with main effects only implies that there are \(2^7 = 128\) possible models in the candidate set. Since we want to keep the results for all of these models (by default only up to 100 model fits are kept), we set the confidence set size to 128. Using the Akaike Information Criterion (AIC) as the ranking criterion, an AIC value is computed for each model and used for model selection and multimodel inference.
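For reference, the Akaike weight reported for each model in the selection table further below is derived from its AIC difference (delta) to the best model. A minimal sketch of that calculation with made-up AIC values:

Code
# Akaike weights from a set of AIC values (illustrative numbers only)
aic    <- c(m1 = 100.0, m2 = 101.2, m3 = 103.5)
delta  <- aic - min(aic)                              # difference to the best model
weight <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))  # Akaike weights sum to 1
round(weight, 3)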

Let's put the BFI data and all the covariates in the same table.

Code

# put in a table
dat1 <- covs_all# BFI_site # |> left_join(covs_all)
# dat <- dat1 |> dplyr::select("bfi_unscaled",  
#                              "DBH_cm", 
#                              "H_m", 
#                              "AGBplot", 
#                              "AGBsubplot", 
#                              "BGC", 
#                              "Phosphorus", "pHw", "Conductivity", "LUC_Impact", "Mangrove_typology") #bfi_scaled_scale
dat <- dat1[,c(5,19,20,21,26,28,30,31,32)] # keep bfi_unscaled and the candidate predictor columns (selected by position)

Next, we fit all the possible models with the function dredge.

Now we can fit all 128 models and examine those models whose AIC value is no more than 2 units away from that of the best model.

Code

library(MASS) # stepAIC

full <- lm(bfi_unscaled~., data=dat)


# Now we can fit all 128 models and examine those models whose AICc value is no more than 2 units away from that of the best model with:
options(na.action = "na.fail")
res <- dredge(full, 
              rank = "AIC", # can be AICc as well
              beta="sd", 
              # extra = c("R^2"),
              m.lim = c(1, 2))
# subset(res, delta <= 2, recalc.weights=FALSE)
res # summary(model.avg(res))
#> Global model call: lm(formula = bfi_unscaled ~ ., data = dat)
#> ---
#> Model selection table 
#>     (Int)    AGBp     AGBs      b17     bi9 frs_int     H_m LUC_Imp      pHw df
#> 97      0                                           -0.5802       +           7
#> 161     0                                           -0.5890         -0.47180  4
#> 49      0                                   0.44750 -0.5232                   4
#> 67      0         -0.96090                                        +           7
#> 33      0                                           -0.3862                   3
#> 35      0          0.35970                          -0.5844                   4
#> 34      0 0.28970                                   -0.4314                   4
#> 17      0                                   0.28730                           3
#> 25      0                           0.32290 0.44090                           4
#> 2       0 0.22250                                                             3
#> 129     0                                                           -0.21860  3
#> 41      0                           0.04567         -0.3781                   4
#> 37      0                  -0.04250                 -0.3741                   4
#> 5       0                  -0.14950                                           3
#> 9       0                           0.11310                                   3
#> 3       0          0.03779                                                    3
#> 69      0                  -0.44610                               +           7
#> 19      0         -0.18000                  0.38820                           4
#> 133     0                  -0.21850                                 -0.27370  4
#> 18      0 0.07368                           0.24200                           4
#> 21      0                  -0.05050         0.26870                           4
#> 145     0                                   0.25970                 -0.04014  4
#> 10      0 0.27390                   0.18800                                   4
#> 13      0                  -0.27700 0.25280                                   4
#> 130     0 0.16060                                                   -0.15420  4
#> 4       0 0.32330 -0.16270                                                    4
#> 137     0                           0.13430                         -0.23100  4
#> 131     0         -0.10060                                          -0.27020  4
#> 6       0 0.19380          -0.08006                                           4
#> 65      0                                                         +           6
#> 7       0          0.05736 -0.15670                                           4
#> 11      0          0.07195          0.13170                                   4
#> 73      0                           0.31900                       +           7
#> 193     0                                                         +  0.34630  7
#> 66      0 0.13060                                                 +           7
#> 81      0                                   0.05331               +           7
#>       logLik   AIC delta weight
#> 97  -100.970 215.9  0.00  0.144
#> 161 -104.038 216.1  0.14  0.134
#> 49  -104.038 216.1  0.14  0.134
#> 67  -101.498 217.0  1.06  0.085
#> 33  -105.957 217.9  1.98  0.054
#> 35  -105.062 218.1  2.18  0.048
#> 34  -105.148 218.3  2.36  0.044
#> 17  -106.561 219.1  3.18  0.029
#> 25  -105.825 219.6  3.71  0.023
#> 2   -106.844 219.7  3.75  0.022
#> 129 -106.858 219.7  3.78  0.022
#> 41  -105.938 219.9  3.94  0.020
#> 37  -105.942 219.9  3.94  0.020
#> 5   -107.069 220.1  4.20  0.018
#> 9   -107.147 220.3  4.35  0.016
#> 3   -107.238 220.5  4.54  0.015
#> 69  -103.363 220.7  4.79  0.013
#> 19  -106.365 220.7  4.79  0.013
#> 133 -106.473 220.9  5.01  0.012
#> 18  -106.531 221.1  5.12  0.011
#> 21  -106.541 221.1  5.14  0.011
#> 145 -106.553 221.1  5.17  0.011
#> 10  -106.563 221.1  5.19  0.011
#> 13  -106.669 221.3  5.40  0.010
#> 130 -106.674 221.3  5.41  0.010
#> 4   -106.705 221.4  5.47  0.009
#> 137 -106.706 221.4  5.47  0.009
#> 131 -106.795 221.6  5.65  0.009
#> 6   -106.796 221.6  5.65  0.009
#> 65  -104.933 221.9  5.93  0.007
#> 7   -107.042 222.1  6.15  0.007
#> 11  -107.108 222.2  6.28  0.006
#> 73  -104.380 222.8  6.82  0.005
#> 193 -104.383 222.8  6.83  0.005
#> 66  -104.833 223.7  7.73  0.003
#> 81  -104.919 223.8  7.90  0.003
#> Models ranked by AIC(x)
plot(res)

The plot shows which variables were included in each of the ranked models, i.e., how often each variable appears across the candidate set.
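Following the commented-out line in the chunk above, the models within 2 AIC units of the best one can also be pulled out and averaged. A sketch with MuMIn, assuming res is the dredge object created above:

Code
# models whose AIC is within 2 units of the best model
top_set <- subset(res, delta <= 2, recalc.weights = FALSE)
top_set

# multimodel inference: average the coefficients over this confidence set
summary(model.avg(top_set))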

Now we refit the best linear model.

Code

# 8. Refit best linear model
bestmodel <- get.models(res, 1)[[1]]
z <- lm(bestmodel, data = dat)
tab_model(summary(z), bootstrap=TRUE)
                                              bfi_unscaled
Predictors                                    Estimates   CI                  p
(Intercept)                                   2021.46     1780.18 – 2570.65   <0.001
H m                                           -19.72      -42.60 – -6.83      <0.001
LUC Impact [Agriculture]                      109.05      -191.43 – 380.64    0.335
LUC Impact [Agriculture and shrimp farms]     -80.30      -361.80 – 192.82    0.576
LUC Impact [River urban discharge]            -209.96     -655.53 – 8.61      0.057
LUC Impact [Shrimp farms]                     -174.00     -600.84 – 150.43    0.271
Observations                                  16
R2 / R2 adjusted                              0.544 / 0.316

According to the lowest AIC among the 128 candidate models, the best model is:

\[ BFI_{unscaled} \sim H_m + LUC_{Impact} + intercept \]
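For readability, the same best model can also be written out explicitly instead of being extracted with get.models(); a sketch, assuming the column names used above:

Code
# equivalent to bestmodel <- get.models(res, 1)[[1]]
z1 <- lm(bfi_unscaled ~ H_m + LUC_Impact, data = dat)
summary(z1)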

Checking assumptions

Code


# result2 <- check_normality(z)
# plot(result2, type = "density")
# plot(result2, type = "qq")
# plot(result2, type = "pp")
# 
# result3 <- check_heteroscedasticity(z)
# plot(result3)
# check_predictions(z, check_range = TRUE)


out <- check_model(z)
plot(out, type = "discrete_both")

It seems our “best” model according to the AIC is z: \[ BFI_{unscaled} \sim H_m + LUC_{Impact} + intercept \]

It does not fully meet the normality assumption, and the posterior predictive check is not that good. But is normality that important? Let's relax that assumption…
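One way to quantify those visual impressions before relaxing the assumption is a sketch using base R and the performance functions already loaded, with z being the refitted best model:

Code
# Shapiro-Wilk test on the residuals of the best fixed-effects model
shapiro.test(residuals(z))

# formal checks from the performance package
check_normality(z)
check_heteroscedasticity(z)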

Is there any interaction?

Let's include all possible two-way interactions between H_m, pHw, AGBsubplot, and forest_integrity, and then select the model with the lowest AIC.

Code

# put in a table
# dat2 <- dat1 |>  dplyr::select("bfi_unscaled", 
#                               "H_m" ,
#                               `BGC`,
#                                 "pHw")

dat2 <- dat1[,c(5,19,21,26,32)] # bfi_unscaled plus H_m, AGBsubplot, pHw and forest_integrity
# 12. Include all two-way interactions
z2full <- lm(bfi_unscaled ~ (.)^2, data = dat2)
z2 <- stepAIC(z2full, upper = ~., lower = ~1, direction = "both")
#> Start:  AIC=172.6
#> bfi_unscaled ~ (H_m + AGBsubplot + pHw + forest_integrity)^2
#> 
#>                               Df Sum of Sq    RSS    AIC
#> - pHw:forest_integrity         1       271 196109 170.62
#> - H_m:pHw                      1      9194 205031 171.33
#> - H_m:AGBsubplot               1      9384 205221 171.35
#> - AGBsubplot:pHw               1     14818 210655 171.77
#> - AGBsubplot:forest_integrity  1     22713 218550 172.35
#> <none>                                     195837 172.60
#> - H_m:forest_integrity         1     49051 244888 174.18
#> 
#> Step:  AIC=170.62
#> bfi_unscaled ~ H_m + AGBsubplot + pHw + forest_integrity + H_m:AGBsubplot + 
#>     H_m:pHw + H_m:forest_integrity + AGBsubplot:pHw + AGBsubplot:forest_integrity
#> 
#>                               Df Sum of Sq    RSS    AIC
#> - H_m:pHw                      1      9095 205204 169.35
#> - H_m:AGBsubplot               1     14051 210160 169.73
#> - AGBsubplot:pHw               1     18980 215089 170.10
#> - AGBsubplot:forest_integrity  1     22448 218556 170.35
#> <none>                                     196109 170.62
#> - H_m:forest_integrity         1     48973 245081 172.19
#> + pHw:forest_integrity         1       271 195837 172.60
#> 
#> Step:  AIC=169.35
#> bfi_unscaled ~ H_m + AGBsubplot + pHw + forest_integrity + H_m:AGBsubplot + 
#>     H_m:forest_integrity + AGBsubplot:pHw + AGBsubplot:forest_integrity
#> 
#>                               Df Sum of Sq    RSS    AIC
#> - AGBsubplot:pHw               1     12355 217559 168.28
#> - H_m:AGBsubplot               1     13190 218394 168.34
#> - AGBsubplot:forest_integrity  1     14877 220081 168.47
#> <none>                                     205204 169.35
#> - H_m:forest_integrity         1     44947 250151 170.52
#> + H_m:pHw                      1      9095 196109 170.62
#> + pHw:forest_integrity         1       172 205031 171.33
#> 
#> Step:  AIC=168.28
#> bfi_unscaled ~ H_m + AGBsubplot + pHw + forest_integrity + H_m:AGBsubplot + 
#>     H_m:forest_integrity + AGBsubplot:forest_integrity
#> 
#>                               Df Sum of Sq    RSS    AIC
#> - AGBsubplot:forest_integrity  1      7210 224769 166.80
#> - pHw                          1      7478 225037 166.82
#> - H_m:AGBsubplot               1     13394 230953 167.24
#> <none>                                     217559 168.28
#> + AGBsubplot:pHw               1     12355 205204 169.35
#> - H_m:forest_integrity         1     50698 268257 169.63
#> + pHw:forest_integrity         1      6459 211100 169.80
#> + H_m:pHw                      1      2470 215089 170.10
#> 
#> Step:  AIC=166.8
#> bfi_unscaled ~ H_m + AGBsubplot + pHw + forest_integrity + H_m:AGBsubplot + 
#>     H_m:forest_integrity
#> 
#>                               Df Sum of Sq    RSS    AIC
#> - pHw                          1      4440 229209 165.12
#> - H_m:AGBsubplot               1     11742 236511 165.62
#> <none>                                     224769 166.80
#> + AGBsubplot:forest_integrity  1      7210 217559 168.28
#> + H_m:pHw                      1      5428 219341 168.41
#> + AGBsubplot:pHw               1      4688 220081 168.47
#> + pHw:forest_integrity         1      4130 220639 168.51
#> - H_m:forest_integrity         1     60653 285422 168.63
#> 
#> Step:  AIC=165.12
#> bfi_unscaled ~ H_m + AGBsubplot + forest_integrity + H_m:AGBsubplot + 
#>     H_m:forest_integrity
#> 
#>                               Df Sum of Sq    RSS    AIC
#> - H_m:AGBsubplot               1      7670 236880 163.64
#> <none>                                     229209 165.12
#> + pHw                          1      4440 224769 166.80
#> + AGBsubplot:forest_integrity  1      4172 225037 166.82
#> - H_m:forest_integrity         1    104243 333453 169.12
#> 
#> Step:  AIC=163.64
#> bfi_unscaled ~ H_m + AGBsubplot + forest_integrity + H_m:forest_integrity
#> 
#>                               Df Sum of Sq    RSS    AIC
#> <none>                                     236880 163.64
#> + H_m:AGBsubplot               1      7670 229209 165.12
#> + AGBsubplot:forest_integrity  1      4555 232325 165.33
#> + pHw                          1       369 236511 165.62
#> - AGBsubplot                   1     68779 305658 165.72
#> - H_m:forest_integrity         1    172916 409796 170.41
tab_model(summary(z2))
                          bfi_unscaled
Predictors                Estimates   CI                  p
(Intercept)               2374.34     1850.24 – 2898.43   <0.001
H m                       -74.95      -120.89 – -29.01    0.004
AGBsubplot                1.01        -0.23 – 2.25        0.101
forest integrity          -105.07     -215.97 – 5.82      0.061
H m × forest integrity    9.40        2.10 – 16.69        0.016
Observations              16
R2 / R2 adjusted          0.619 / 0.481

The answer is yes: there seems to be an interaction between H_m and forest_integrity. So another plausible model is z2: \[ BFI_{unscaled} \sim H_m + AGBsubplot + ForestIntegrity + H_m:ForestIntegrity \]
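One way to look at this interaction directly (a sketch with visreg, which is already loaded) is to plot the effect of H_m at different levels of forest_integrity:

Code
# effect of tree height on BFI, conditioned on forest integrity
visreg(z2, xvar = "H_m", by = "forest_integrity", overlay = TRUE)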

Which model is the best?

Let's compare the fixed-effects model performance of z1 and z2, together with several additional models that include LUC_Impact.

We use the models:

z1 <- lm(bfi_unscaled ~ H_m + LUC_Impact, dat)
z2 <- lm(bfi_unscaled ~ H_m + AGBsubplot + forest_integrity + H_m:forest_integrity, dat)
z3 <- lm(bfi_unscaled ~ H_m + AGBsubplot + forest_integrity + H_m:forest_integrity + LUC_Impact, dat)
z4 <- lm(bfi_unscaled ~ AGBsubplot + H_m * LUC_Impact, dat)
z5 <- lm(bfi_unscaled ~ AGBsubplot + H_m + forest_integrity + LUC_Impact, dat)
z6 <- lm(bfi_unscaled ~ (H_m)^2 * AGBsubplot + LUC_Impact, dat)

Code

z3 <- lm(bfi_unscaled ~ H_m + AGBsubplot + forest_integrity + H_m:forest_integrity + LUC_Impact, dat)
z4 <- lm(bfi_unscaled ~ AGBsubplot + H_m * LUC_Impact, dat)
z5 <- lm(bfi_unscaled ~ AGBsubplot + H_m + forest_integrity + LUC_Impact, dat)
z6 <- lm(bfi_unscaled ~ (H_m)^2 * AGBsubplot + LUC_Impact, dat)

result1 <- compare_performance(z,z2,z3,z4,z5,z6,  rank=TRUE)

DT::datatable(result1)
Code
plot(result1)

Code

# test_performance(z, z2, z3)
# test_bf(z, z2, z3)
# lmtest::lrtest(z, z2, z3)

Larger values indicate better model performance. Hence, points closer to the center indicate worse fit indices.

The best model is again z2:

\[ BFI_{unscaled} \sim H_m + AGBsubplot + ForestIntegrity + H_m:ForestIntegrity \]

Code
sjPlot::tab_model(z2)
                          bfi_unscaled
Predictors                Estimates   CI                  p
(Intercept)               2374.34     1850.24 – 2898.43   <0.001
H m                       -74.95      -120.89 – -29.01    0.004
AGBsubplot                1.01        -0.23 – 2.25        0.101
forest integrity          -105.07     -215.97 – 5.82      0.061
H m × forest integrity    9.40        2.10 – 16.69        0.016
Observations              16
R2 / R2 adjusted          0.619 / 0.481

Let's see how well z2 meets the normality assumptions.

Code

result4 <- check_normality(z2)
plot(result4, type = "density")

Code
plot(result4, type = "qq")

Code
plot(result4, type = "pp")

Code

out3 <- check_model(z2, panel=TRUE)
plot(out3, type = "discrete_both")

Let's see the model coefficients and predictions for z2.

Code

# visreg(z2, xvar = c("DBH_cm", "H_m", "pHw"))
dat4 <- predict_response(z2, terms = c("AGBsubplot", "H_m", "forest_integrity"))
plot(dat4, facets = TRUE)

Code



plot_model(z2, type = "est")

Code
# plot_model(lm2, type = "re") # in case random effect
plot_model(z2, type = "pred")
#> $H_m

#> 
#> $AGBsubplot

#> 
#> $forest_integrity

Let's try some mixed models with a random effect on LUC_Impact

We used the models:

lm4 <- glmer(bfi_unscaled ~ AGBsubplot + (1|LUC_Impact), data = dat).
lm5 <- glmer(bfi_unscaled ~ AGBsubplot + forest_integrity + (1|LUC_Impact), data = dat).
lm6 <- glmer(bfi_unscaled ~ AGBsubplot + H_m + (1|LUC_Impact), data = dat).
lm7 <- glmer(bfi_unscaled ~ AGBsubplot + H_m + LUC_Impact + (1|LUC_Impact), data = dat).
lm8 <- glmer(bfi_unscaled ~ AGBsubplot * LUC_Impact + H_m + (1|LUC_Impact), data = dat).
lm9 <- glmer(bfi_unscaled ~ AGBsubplot + H_m + LUC_Impact + (1|LUC_Impact), data = dat).

Code

library(lme4)

lm4 <- glmer(bfi_unscaled ~ AGBsubplot + (1|LUC_Impact), data =  dat)
lm5 <- glmer(bfi_unscaled ~ AGBsubplot + forest_integrity + (1|LUC_Impact), data =  dat)
lm6 <- glmer(bfi_unscaled ~ AGBsubplot + H_m + (1|LUC_Impact), data =  dat)
lm7 <- glmer(bfi_unscaled ~ AGBsubplot + H_m + LUC_Impact + (1|LUC_Impact), data =  dat)
lm8 <- glmer(bfi_unscaled ~ AGBsubplot * LUC_Impact + H_m + (1|LUC_Impact), data =  dat)
lm9 <- glmer(bfi_unscaled ~ AGBsubplot + H_m + LUC_Impact + (1|LUC_Impact), data =  dat) # note: same formula as lm7
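A note on these fits: glmer() with its default gaussian family (identity link) simply redirects to lmer() with a deprecation warning, so these are ordinary linear mixed models. As a sketch, lm6 could be written explicitly as:

Code
# equivalent explicit linear mixed model fit with lmer()
lm6b <- lmer(bfi_unscaled ~ AGBsubplot + H_m + (1 | LUC_Impact), data = dat)
summary(lm6b)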

Let's compare the mixed models with a random effect on LUC_Impact.

Code



result2 <- compare_performance(lm4,lm5,lm6,lm7,lm8,lm9,  rank=TRUE)

DT::datatable(result2)
Code
plot(result2)

Code

# test_performance(z, z2, z3)
# test_bf(z, z2, z3)
# lmtest::lrtest(z, z2, z3)

The best model is lm6:

\[ BFI_{unscaled} \sim AGBsubplot + H_m + (1 | LUC_{Impact}) \]

Code
sjPlot::tab_model(lm6)
                                 bfi_unscaled
Predictors                       Estimates   CI                  p
(Intercept)                      1861.87     1586.25 – 2137.49   <0.001
AGBsubplot                       0.45        -0.96 – 1.86        0.497
H m                              -19.39      -40.41 – 1.63       0.067
Random Effects
σ2                               32231.01
τ00 LUC_Impact                   6130.73
ICC                              0.16
N LUC_Impact                     5
Observations                     16
Marginal R2 / Conditional R2     0.201 / 0.329

Let's see how well lm6 meets the model assumptions.

Code

result2 <- check_normality(lm6)
plot(result2, type = "density")

Code
plot(result2, type = "qq")

Code
plot(result2, type = "pp")

Code

out4 <- check_model(lm6, panel=TRUE)
plot(out4, type = "discrete_both")

Let's see the model coefficients and predictions for lm6.

Code

# visreg(z3, xvar = c("DBH_cm", "H_m", "pHw"))
dat2 <- predict_response(lm6, terms = c("AGBsubplot",  "H_m", "LUC_Impact"))
plot(dat2, facets = TRUE)

Code

plot_model(lm6, type = "est")

Code
# plot_model(lm2, type = "re") # in case random effect
plot_model(lm6, type = "pred")
#> $AGBsubplot

#> 
#> $H_m

Result of model selection

Code
library(report)
report::report(z2)
#> We fitted a linear model (estimated using OLS) to predict bfi_unscaled with
#> H_m, AGBsubplot and forest_integrity (formula: bfi_unscaled ~ H_m + AGBsubplot
#> + forest_integrity + H_m:forest_integrity). The model explains a statistically
#> significant and substantial proportion of variance (R2 = 0.62, F(4, 11) = 4.47,
#> p = 0.022, adj. R2 = 0.48). The model's intercept, corresponding to H_m = 0,
#> AGBsubplot = 0 and forest_integrity = 0, is at 2374.34 (95% CI [1850.24,
#> 2898.43], t(11) = 9.97, p < .001). Within this model:
#> 
#>   - The effect of H m is statistically significant and negative (beta = -74.95,
#> 95% CI [-120.89, -29.01], t(11) = -3.59, p = 0.004; Std. beta = -0.93, 95% CI
#> [-1.49, -0.37])
#>   - The effect of AGBsubplot is statistically non-significant and positive (beta
#> = 1.01, 95% CI [-0.23, 2.25], t(11) = 1.79, p = 0.101; Std. beta = 0.51, 95% CI
#> [-0.12, 1.15])
#>   - The effect of forest integrity is statistically non-significant and negative
#> (beta = -105.07, 95% CI [-215.97, 5.82], t(11) = -2.09, p = 0.061; Std. beta =
#> 0.25, 95% CI [-0.25, 0.76])
#>   - The effect of H m × forest integrity is statistically significant and
#> positive (beta = 9.40, 95% CI [2.10, 16.69], t(11) = 2.83, p = 0.016; Std. beta
#> = 0.74, 95% CI [0.17, 1.31])
#> 
#> Standardized parameters were obtained by fitting the model on a standardized
#> version of the dataset. 95% Confidence Intervals (CIs) and p-values were
#> computed using a Wald t-distribution approximation.
report::report(lm6)
#> We fitted a linear mixed model (estimated using REML and nloptwrap optimizer)
#> to predict bfi_unscaled with AGBsubplot and H_m (formula: bfi_unscaled ~
#> AGBsubplot + H_m). The model included LUC_Impact as random effect (formula: ~1
#> | LUC_Impact). The model's total explanatory power is substantial (conditional
#> R2 = 0.33) and the part related to the fixed effects alone (marginal R2) is of
#> 0.20. The model's intercept, corresponding to AGBsubplot = 0 and H_m = 0, is at
#> 1861.87 (95% CI [1586.25, 2137.49], t(11) = 14.87, p < .001). Within this
#> model:
#> 
#>   - The effect of AGBsubplot is statistically non-significant and positive (beta
#> = 0.45, 95% CI [-0.96, 1.86], t(11) = 0.70, p = 0.497; Std. beta = 0.23, 95% CI
#> [-0.49, 0.95])
#>   - The effect of H m is statistically non-significant and negative (beta =
#> -19.39, 95% CI [-40.41, 1.63], t(11) = -2.03, p = 0.067; Std. beta = -0.57, 95%
#> CI [-1.19, 0.05])
#> 
#> Standardized parameters were obtained by fitting the model on a standardized
#> version of the dataset. 95% Confidence Intervals (CIs) and p-values were
#> computed using a Wald t-distribution approximation.

Fixed-effects model

We fitted a linear model (estimated using OLS) to predict bfi_unscaled with H_m, AGBsubplot and forest_integrity (formula: bfi_unscaled ~ H_m + AGBsubplot + forest_integrity + H_m:forest_integrity). The model explains a statistically significant and substantial proportion of variance (R2 = 0.62, F(4, 11) = 4.47, p = 0.022, adj. R2 = 0.48). The model’s intercept, corresponding to H_m = 0, AGBsubplot = 0 and forest_integrity = 0, is at 2374.34 (95% CI [1850.24, 2898.43], t(11) = 9.97, p < .001). Within this model:

  • The effect of H m is statistically significant and negative (beta = -74.95, 95% CI [-120.89, -29.01], t(11) = -3.59, p = 0.004; Std. beta = -0.93, 95% CI [-1.49, -0.37])
  • The effect of AGBsubplot is statistically non-significant and positive (beta = 1.01, 95% CI [-0.23, 2.25], t(11) = 1.79, p = 0.101; Std. beta = 0.51, 95% CI [-0.12, 1.15])
  • The effect of forest integrity is statistically non-significant and negative (beta = -105.07, 95% CI [-215.97, 5.82], t(11) = -2.09, p = 0.061; Std. beta = 0.25, 95% CI [-0.25, 0.76])
  • The effect of H m × forest integrity is statistically significant and positive (beta = 9.40, 95% CI [2.10, 16.69], t(11) = 2.83, p = 0.016; Std. beta = 0.74, 95% CI [0.17, 1.31])

Standardized parameters were obtained by fitting the model on a standardized version of the dataset. 95% Confidence Intervals (CIs) and p-values were computed using a Wald t-distribution approximation.

Mixed-effects model

We fitted a linear mixed model (estimated using REML and nloptwrap optimizer) to predict bfi_unscaled with AGBsubplot and H_m (formula: bfi_unscaled ~ AGBsubplot + H_m). The model included LUC_Impact as random effect (formula: ~1 | LUC_Impact). The model’s total explanatory power is substantial (conditional R2 = 0.33) and the part related to the fixed effects alone (marginal R2) is of 0.20. The model’s intercept, corresponding to AGBsubplot = 0 and H_m = 0, is at 1861.87 (95% CI [1586.25, 2137.49], t(11) = 14.87, p < .001). Within this model:

  • The effect of AGBsubplot is statistically non-significant and positive (beta = 0.45, 95% CI [-0.96, 1.86], t(11) = 0.70, p = 0.497; Std. beta = 0.23, 95% CI [-0.49, 0.95])
  • The effect of H m is statistically non-significant and negative (beta = -19.39, 95% CI [-40.41, 1.63], t(11) = -2.03, p = 0.067; Std. beta = -0.57, 95% CI [-1.19, 0.05])

Standardized parameters were obtained by fitting the model on a standardized version of the dataset. 95% Confidence Intervals (CIs) and p-values were computed using a Wald t-distribution approximation.

Conclusion

The answer to the initial question, “Is carbon, measured as AGBsubplot, a good predictor of BFI?”, is yes!

In both models (the fixed-effects model with interaction and the mixed-effects model), carbon at AGBsubplot shows a positive relationship with BFI. However, the AGBsubplot effect is not statistically significant in either “best” model.
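To back this up, the AGBsubplot estimates and confidence intervals can be pulled from both final models directly. A sketch (Wald intervals are used for the mixed model to keep it simple):

Code
# fixed-effects model with interaction (z2): AGBsubplot estimate and 95% CI
coef(summary(z2))["AGBsubplot", ]
confint(z2)["AGBsubplot", ]

# mixed-effects model (lm6): fixed-effect estimate and Wald 95% CI
fixef(lm6)["AGBsubplot"]
confint(lm6, method = "Wald")["AGBsubplot", ]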

Package Citation

Code
pkgs <- cite_packages(output = "paragraph", out.dir = ".")
knitr::kable(pkgs)
x
Bartoń K (2023). MuMIn: Multi-Model Inference. R package version 1.47.5, https://CRAN.R-project.org/package=MuMIn.
Bates D, Mächler M, Bolker B, Walker S (2015). “Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01 https://doi.org/10.18637/jss.v067.i01.
Bates D, Maechler M, Jagan M (2023). Matrix: Sparse and Dense Matrix Classes and Methods. R package version 1.6-1.1, https://CRAN.R-project.org/package=Matrix.
Breheny P, Burchett W (2017). “Visualization of Regression Models Using visreg.” The R Journal, 9(2), 56-71.
Francisco Rodriguez-Sanchez, Connor P. Jackson (2023). grateful: Facilitate citation of R packages. https://pakillo.github.io/grateful/.
Gilbert P, Varadhan R (2019). numDeriv: Accurate Numerical Derivatives. R package version 2016.8-1.1, https://CRAN.R-project.org/package=numDeriv.
Grolemund G, Wickham H (2011). “Dates and Times Made Easy with lubridate.” Journal of Statistical Software, 40(3), 1-25. https://www.jstatsoft.org/v40/i03/.
Hijmans R (2024). terra: Spatial Data Analysis. R package version 1.7-71, https://CRAN.R-project.org/package=terra.
Hijmans RJ, Barbosa M, Ghosh A, Mandel A (2023). geodata: Download Geographic Data. R package version 0.5-9, https://CRAN.R-project.org/package=geodata.
Iannone R, Cheng J, Schloerke B, Hughes E, Lauer A, Seo J (2024). gt: Easily Create Presentation-Ready Display Tables. R package version 0.10.1, https://CRAN.R-project.org/package=gt.
Lüdecke D (2018). “ggeffects: Tidy Data Frames of Marginal Effects from Regression Models.” Journal of Open Source Software, 3(26), 772. doi:10.21105/joss.00772 https://doi.org/10.21105/joss.00772.
Lüdecke D (2024). sjPlot: Data Visualization for Statistics in Social Science. R package version 2.8.16, https://CRAN.R-project.org/package=sjPlot.
Lüdecke D, Ben-Shachar M, Patil I, Waggoner P, Makowski D (2021). “performance: An R Package for Assessment, Comparison and Testing of Statistical Models.” Journal of Open Source Software, 6(60), 3139. doi:10.21105/joss.03139 https://doi.org/10.21105/joss.03139.
Lüdecke D, Patil I, Ben-Shachar M, Wiernik B, Waggoner P, Makowski D (2021). “see: An R Package for Visualizing Statistical Models.” Journal of Open Source Software, 6(64), 3393. doi:10.21105/joss.03393 https://doi.org/10.21105/joss.03393.
Makowski D, Lüdecke D, Patil I, Thériault R, Ben-Shachar M, Wiernik B (2023). “Automated Results Reporting as a Practical Tool to Improve Reproducibility and Methodological Best Practices Adoption.” CRAN. https://easystats.github.io/report/.
Müller K, Wickham H (2023). tibble: Simple Data Frames. R package version 3.2.1, https://CRAN.R-project.org/package=tibble.
Pebesma E, Bivand R (2023). Spatial Data Science: With applications in R. Chapman and Hall/CRC. doi:10.1201/9780429459016 https://doi.org/10.1201/9780429459016, https://r-spatial.org/book/. Pebesma E (2018). “Simple Features for R: Standardized Support for Spatial Vector Data.” The R Journal, 10(1), 439-446. doi:10.32614/RJ-2018-009 https://doi.org/10.32614/RJ-2018-009, https://doi.org/10.32614/RJ-2018-009.
R Core Team (2023). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.
Venables WN, Ripley BD (2002). Modern Applied Statistics with S, Fourth edition. Springer, New York. ISBN 0-387-95457-0, https://www.stats.ox.ac.uk/pub/MASS4/.
Viechtbauer W (2010). “Conducting meta-analyses in R with the metafor package.” Journal of Statistical Software, 36(3), 1-48. doi:10.18637/jss.v036.i03 https://doi.org/10.18637/jss.v036.i03.
Wei T, Simko V (2021). R package ‘corrplot’: Visualization of a Correlation Matrix. (Version 0.92), https://github.com/taiyun/corrplot.
White T, Noble D, Senior A, Hamilton W, Viechtbauer W (2022). metadat: Meta-Analysis Datasets. R package version 1.2-0, https://CRAN.R-project.org/package=metadat.
Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.
Wickham H (2023). forcats: Tools for Working with Categorical Variables (Factors). R package version 1.0.0, https://CRAN.R-project.org/package=forcats.
Wickham H (2023). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.5.1, https://CRAN.R-project.org/package=stringr.
Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). “Welcome to the tidyverse.” Journal of Open Source Software, 4(43), 1686. doi:10.21105/joss.01686 https://doi.org/10.21105/joss.01686.
Wickham H, Bryan J (2023). readxl: Read Excel Files. R package version 1.4.3, https://CRAN.R-project.org/package=readxl.
Wickham H, François R, Henry L, Müller K, Vaughan D (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.4, https://CRAN.R-project.org/package=dplyr.
Wickham H, Henry L (2023). purrr: Functional Programming Tools. R package version 1.0.2, https://CRAN.R-project.org/package=purrr.
Wickham H, Hester J, Bryan J (2024). readr: Read Rectangular Text Data. R package version 2.1.5, https://CRAN.R-project.org/package=readr.
Wickham H, Vaughan D, Girlich M (2024). tidyr: Tidy Messy Data. R package version 1.3.1, https://CRAN.R-project.org/package=tidyr.
Xie Y, Cheng J, Tan X (2024). DT: A Wrapper of the JavaScript Library ‘DataTables’. R package version 0.32, https://CRAN.R-project.org/package=DT.
Code
# pkgs

Session info

Code
print(sessionInfo(), locale = FALSE)
#> R version 4.3.2 (2023-10-31 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19042)
#> 
#> Matrix products: default
#> 
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] report_0.5.8         lme4_1.1-35.3        MASS_7.3-60         
#>  [4] ggeffects_1.6.0      forcats_1.0.0        dplyr_1.1.4         
#>  [7] purrr_1.0.2          readr_2.1.5          tidyr_1.3.1         
#> [10] tibble_3.2.1         ggplot2_3.5.1        tidyverse_2.0.0     
#> [13] performance_0.11.0.9 see_0.8.4.1          geodata_0.5-9       
#> [16] grateful_0.2.4       DT_0.32              corrplot_0.92       
#> [19] sjPlot_2.8.16        terra_1.7-71         visreg_2.7.0        
#> [22] metafor_4.6-0        numDeriv_2016.8-1.1  metadat_1.2-0       
#> [25] Matrix_1.6-1.1       MuMIn_1.47.5         sf_1.0-15           
#> [28] stringr_1.5.1        lubridate_1.9.3      gt_0.10.1           
#> [31] readxl_1.4.3        
#> 
#> loaded via a namespace (and not attached):
#>   [1] RColorBrewer_1.1-3      mathjaxr_1.6-0          rstudioapi_0.16.0      
#>   [4] jsonlite_1.8.8          datawizard_0.10.0.4     magrittr_2.0.3         
#>   [7] TH.data_1.1-2           estimability_1.5        nloptr_2.0.3           
#>  [10] farver_2.1.1            rmarkdown_2.27          vctrs_0.6.5            
#>  [13] minqa_1.2.6             effectsize_0.8.8        htmltools_0.5.7        
#>  [16] haven_2.5.4             cellranger_1.1.0        sjmisc_2.8.10          
#>  [19] sass_0.4.8              pracma_2.4.4            KernSmooth_2.23-22     
#>  [22] bslib_0.6.1             htmlwidgets_1.6.4       sandwich_3.1-0         
#>  [25] cachem_1.0.8            emmeans_1.10.2          zoo_1.8-12             
#>  [28] lifecycle_1.0.4         iterators_1.0.14        pkgconfig_2.0.3        
#>  [31] sjlabelled_1.2.0        R6_2.5.1                fastmap_1.1.1          
#>  [34] digest_0.6.34           colorspace_2.1-0        patchwork_1.2.0        
#>  [37] crosstalk_1.2.1         labeling_0.4.3          fansi_1.0.6            
#>  [40] timechange_0.3.0        mgcv_1.9-1              compiler_4.3.2         
#>  [43] proxy_0.4-27            withr_3.0.0             doParallel_1.0.17      
#>  [46] DBI_1.2.2               sjstats_0.19.0          classInt_0.4-10        
#>  [49] caTools_1.18.2          tools_4.3.2             units_0.8-5            
#>  [52] qqconf_1.3.2            glue_1.7.0              nlme_3.1-163           
#>  [55] grid_4.3.2              generics_0.1.3          qqplotr_0.0.6          
#>  [58] gtable_0.3.4            tzdb_0.4.0              class_7.3-22           
#>  [61] hms_1.1.3               xml2_1.3.6              utf8_1.2.4             
#>  [64] ggrepel_0.9.5           foreach_1.5.2           pillar_1.9.0           
#>  [67] robustbase_0.99-2       splines_4.3.2           lattice_0.22-5         
#>  [70] survival_3.5-7          tidyselect_1.2.1        knitr_1.46             
#>  [73] stats4_4.3.2            xfun_0.44               DEoptimR_1.1-3         
#>  [76] stringi_1.8.3           yaml_2.3.8              boot_1.3-28.1          
#>  [79] evaluate_0.23           codetools_0.2-19        twosamples_2.0.1       
#>  [82] cli_3.6.2               xtable_1.8-4            parameters_0.21.7.1    
#>  [85] pbmcapply_1.5.1         munsell_0.5.0           jquerylib_0.1.4        
#>  [88] Rcpp_1.0.12             coda_0.19-4.1           parallel_4.3.2         
#>  [91] ellipsis_0.3.2          bayestestR_0.13.2.1     opdisDownsampling_1.0.1
#>  [94] bitops_1.0-7            mvtnorm_1.2-4           scales_1.3.0           
#>  [97] e1071_1.7-14            insight_0.19.11.2       rlang_1.1.3            
#> [100] multcomp_1.4-25

Reuse

Citation

BibTeX citation:
@online{lizcano,
  author = {Lizcano, Diego and Velásquez-Tibata, Jorge},
  title = {BFI - {Carbono,} {Parita} {Bay,} {Panama}},
  date = {},
  langid = {en}
}
For attribution, please cite this work as:
Lizcano, Diego, and Jorge Velásquez-Tibata. n.d. “BFI - Carbono, Parita Bay, Panama.”