4 LandR Biomass_speciesParameters Module

module-version-Badge

Issues-badge

4.0.0.1 Authors:

Ian Eddy [aut, cre], Eliot McIntire [aut], Ceres Barros [ctb]

This documentation is work in progress. Potential discrepancies and omissions may exist for the time being. If you find any, contact us using the “Get help” link above.

4.1 Module Overview

4.1.2 Summary

LandR Biomass_speciesParameters (hereafter Biomass_speciesParameters) calibrates species growth and mortality trait values used in Biomass_core, by matching theoretical species’ growth curves obtained with different trait values (see Simulated species data) against observed growth curves derived from Permanent Sample Plots (PSP data) across Canada (see Permanent sample plot data), to find the combination of trait values that allows a better match to the observed curves.

In particular, it directly calibrates the growthcurve, mortalityshape invariant species traits and two new traits inflationFactor and mANPPproportion, which are used to calibrate previously estimated species maximum biomass (maxB) and maximum aboveground net primary productivity (maxANPP) values (see Parameter estimation/calibration).

This module will not obtain other traits or parameters used in Biomass_core and so must be used in conjunction with another data/calibration module that does so (e.g., Biomass_borealDataPrep).

It can however be used stand-alone in an initial developmental phase for easier inspection of the statistical calibration procedure employed.

As of November 04, 2022, the raw PSP data used in this module is not freely available, and data sharing agreements must be obtained from the governments of SK, AB, and BC to obtain it. However, the processed and anonymized PSP data is provided via a Google Drive folder accessed automatically by the module.

A Google Account is therefore necessary to access the data used for calibration.

If you do not have a Google Account, or cannot access the data, please report an issue by clicking on the “Get help” link above.

4.2 Module manual

4.2.1 General functioning

Tree cohort growth and mortality in Biomass_core are essentially determined by five parameters: the invariant species traits ‘growth curve’ (growthcurve), ‘mortality shape’, (mortalityshape) and longevity, and the spatio-temporally varying traits maximum biomass (maxB) and maximum aboveground net primary productivity (maxANPP).

All five traits strongly modulate the shape of species growth curves and so it is important that they are calibrated to the study area in question.

Also, the growth and mortality equations used in Biomass_core are non-linear and their resulting actual biomass accumulation curve is an emergent phenomenon due to competition effects. This means that the ideal trait/parameter values should not be estimated on pure single species growth conditions, as their resulting dynamics will be different in a multi-species context.

Biomass_speciesParameters attempts to address these issues (at least partially) using a “curve-matching” approach. It compares a GAMM fitted to permanent sample plot (PSP) data to a large collection of theoretical species curves, each representing a different set of growth and mortality parameters. This also provides a means to calibrate these traits using a dataset that is independent from the one used to derive initial landscape conditions and initial values of maxB and maxANPP.

While longevity is adjusted using published values (see Biomass_borealDataPrep manual), the remaining four parameters are calibrated using the PSP data. Hence, Biomass_speciesParameters generally follows other data modules, like Biomass_boreaDataPrep, that prepare other traits such as longevity, maxB and maxANPP.

4.2.1.1 Permanent sample plot data

Biomass_speciesParameters can use all the PSP data available (note that it may span several thousands of kilometres), or select the data based on a polygon (studyAreaANPP; see List of input objects).

The default PSP data were initially obtained from the National Forest Inventory (NFI), the Alberta Ministry of Agriculture, the Saskatchewan Ministry of the Environment, and the British Columbia Ministry of Forests, treated for errors and standardized into a single data set with the exact location and identifying attributes anonymized. We only share the randomized and anonymized dataset, as data sharing agreements must be met to access the raw data.

The data include individual species, diameter at breast height (DBH), and sometimes tree height measurements for each tree in a plot, as well as stand age. As part of the standardization process, dead trees were removed from the dataset. Tree biomass was then estimated by tree species, in \(g/m^2\), using either the DBH-only model or a DBH-height model from either Lambert et al. (2005) or Ung et al. (2008) (see P(sim)$biomassModel module parameter in list of parameters).

4.2.1.2 Simulated species data

The Biomass_speciesFactorial module was used to create a library of theoretical species curves (biomass accumulation curves, to be more precise) to which the empirical species curves derived from PSP-biomass are matched for each species trait combination in the study area. The library of curves was created by running several Biomass_core simulations with no reproduction, competition, disturbance, or dispersal effects, on the study area. Each simulation differed in the combination of species trait values that influence growth and mortality dynamics, namely: growthcurve, mortalityshape, longevity, maxANPP and maximum biomass (maxBiomass, not to be confused with the data-driven maxB which is later calibrated).

The values for maxANPP were explored via the mANPPproportion, the ratio of maxANPP to maxBiomass (the parameter used for theroetical curves), as it reflects their relationship.

growthcurve values varied from 0 to 1, in increments of 0.1; mortalityshape varied from 5 to 25, in increments of 1; longevity varied from 150 to 700 in increments of 25; mANPPproportion varied from 0.25 to 10 in increments of 0.25. maxBiomass was held constant at 5000.

This resulted in over 64,000,000 theoretical curves.

Results from these simulations were compiled into a table (cohortDataFactorial ; see List of input objects) that is accessed by Biomass_speciesParameters, so that the module can be run without needing to re-simulate the theoretical curves.

4.2.1.3 Parameter estimation/calibration

Biomass_speciesParameters calibrates growthcurve, mortalityshape and mANPPproportion by matching the theoretical species curves produced by Biomass_speciesFactorial (cohortDataFactorial object) against observed species growth curves from permanent sample plot (PSP) data.

Before fitting the observed species growth curves, the module subsets the PSP data to stand ages below the 95th percent quantile for all species (this can be changed via the P(sim)$quantileAgeSubset module parameter), as records for larger age classes were limited and constituted statistical outliers. In some species, changing the quantile value may improve results, however. Two examples are Pinus banksiana and Populus sp, for which using the 99th percent quantile improved the models, because these are short-lived species for which data at advanced ages is scarce.

The module attempts to fit the models using stands where the focal species is dominant (but not monocultures), while balancing sample size (see biomass weighting below). Hence, for a given species, it only includes plots where the species’ relative biomass is at least 50%. This is, when calibrating Populus tremuloides traits, PSP daa plots are only included if 50% of the stand biomass is composed of P. tremuloides.

In addition, 50 points are added at the origin (age = 0 and biomass = 0) to force the intercept to be essentially 0 age and 0 biomass.

Observed growth curves for each species are then fit using generalized additive mixed models (GAMMs) that relate species biomass (\(B\)) with stand age (\(standAge\)), accounting for the random effects of the measurement year (\(measureYear\)) and plot (\(plotID\)) on the intercept:

\[\begin{equation} B \sim f_{1}(standAge) + (\sim 1 | measureYear + plotID) \tag{4.1} \end{equation}\]

where \(f_{1}\) denotes the smoother function. To avoid overfitting, the module constrains the smoother on stand age to a maximum smoothing degree of 3 (i.e. 3 knots and a polynomial degree of 2) and a default point constraint at 0 that attempts to force the intercept to 0. The smoother degree constraint, however, can be changed via the P(sim)$GAMMknots module parameter.

4.2.1.3.1 Biomass-weighting

In addition, \(B\) is weighted with respect to species dominance. This consisted in 1) calculating the average biomass of each dominant species (i.e. relative biomass in a plot > 0.5; \(domSpeciesB_{1}\)), in each plot and measurement year, and 2) dividing the species average biomass by the average biomass across all n dominant species (\(allDomSpeciesB\)):

\[\begin{equation} \frac{\overline{\rm domSpeciesB_{1}}}{\overline{\rm allDomSpeciesB}} \tag{4.2} \end{equation}\]

For the added 0 age and 0 biomass data the module uses weights equal to 1.

It is possible that some selected species do not have enough data to allow for model convergence. In this case, Biomass_speciesParameters skips trait (re-)calibration, and values remain unchanged.

After fitting each species GAMM, Biomass_speciesParameters compares it to the theoretical curves obtained with a longevity value that matches the focal species’ longevity, and picks the best one based on maximum likelihood. This best theoretical curve will be associated with a given combination of growthcurve, mortalityshape and mANPPproportion values, which are then used directly as the calibrated values, in case of growthcurve and mortalityshape, or to calibrate maxANPP in the case of mANPPproportion (see below).

The user has the option to constrain the values of the growthcurve and mortalityshape parameters. By default, growthcurve is forced to 0.5, mortalityshape is allowed to vary between 15 and 25, and mANPPproportion between 2.0 and 5.0 (see module parameters P(sim)$constrainGrowthCurve, P(sim)constrainMortalityShape and P(sim)constrainMaxANPP). These boundary values were based on preliminary runs and analyses using the default data and may not apply to other data sets, or to different spatial subsets of the default data.

If boundary values are used, Biomass_speciesParameters subsets the theoretical species growth curves to those with trait values within the selected boundaries.

Since simulated growth curves never achieve the maximum biomass parameter (the maxBiomass parameter set to 5000 for all simulations of theoretical species curves, or the maxB parameter in Biomass_core simulations), it acts as an asymptotic limit that reflects the potential maximum biomass for a species in an ecolocation (ecological zone and land cover combination).

Biomass_speciesParameters uses the ratio between the potential maximum biomass (maxBiomass, always 5000) to the achieved maximum biomass in the theoretical curves, to rescale maxB. This ratio is called the inflationFactor and it is multiplied by maxB values previously estimated from data (e.g. by Biomass_borealDataPrep). This way, species simulated in Biomass_core are able to achieve the maximum observed biomasses used to initially estimate maxB.

Finally, the module calibrates maxANPP using the mANPPproportion value from the best matching theoretical growth curve as:

\[\begin{equation} maxB \times \frac{mANPPproportion}{100} \tag{4.3} \end{equation}\]

where maxB is the already (re-)calibrated version. As already stated above, the final maxANPP value is then constrained between 2.0 and 5.0 by default.

In cases where insufficient PSP data prevent fitting the GAMMs and performing the calibration, mANPPproportion defaults to 3.33 (the value used in LANDIS-II applications in Canada’s boreal forests) and the inflationFactor to 1.

4.2.2 List of input objects

The full list of input objects required by the module is presented below (Table 4.1). The only input that must be provided is studyAreaANPP (the study area used extract the PSP data from). All other input objects have internal defaults, but the user may need to request access to their online files.

Of these inputs, the following are particularly important and deserve special attention:

Spatial layers

  • studyAreaANPP – shapefile. A SpatialPolygonsDataFrame with a single polygon determining the where the PSP should be subset to simulation will take place. This input object must be supplied by the user or another module.

Tables

  • factorialSpeciesTable and reducedFactorialCohortData – a tables of species trait combinations and the theoretical species grwoth curve data (respectively).
  • PSPmeasure, PSPplot and PSPgis – tree measurement, biomass growth and geographical data of the PSP datasets used to build observed species growth curves.
  • species – a table of invariant species traits that may have been produced by another module. It must contain the columns ‘species’, ‘growthcurve’ and ‘mortality shape’, whose values will be calibrated.
  • speciesEcoregion – table of spatially-varying species traits that may have been produced by another module. It must contain the columns ‘speciesCode’, ‘maxB’ and ‘maxANPP’ and ‘ecoregionGroup’ (the ecolocation ID). ‘maxB’ and ‘maxANPP’ values are (re-)calibrated by species.
Table 4.1: List of Biomass_speciesParameters input objects and their description.
objectName objectClass desc sourceURL
factorialSpeciesTable data.table Table with species traits to be matched with sim$reducedFactorialCohortData https://drive.google.com/open?id=1q0ou0CBzD9GqGSparpHqf318IWK6ycty
reducedFactorialCohortData data.table Results of factorial species trait simulation. This can be found by running SpeciesFactorial.R but requires a specific commit of Biomass_core https://drive.google.com/open?id=1h8StXE0vm8xyDycRomCkwIaL7wfh5Irj
PSPmeasure data.table Merged PSP and TSP individual tree measurements. Must include the following columns: ‘MeasureID’, ‘OrigPlotID1’, ‘MeasureYear’, ‘TreeNumber’, ‘Species’, ‘DBH’ and ‘newSpeciesName’ the latter corresponding to species names in LandR::sppEquivalencies_CA$PSP. Defaults to randomized PSP data stripped of real plotIDs https://drive.google.com/file/d/1LmOaEtCZ6EBeIlAm6ttfLqBqQnQu4Ca7/view?usp=sharing
PSPplot data.table Merged PSP and TSP plot data. Defaults to randomized PSP data stripped of real plotIDs. Must contain columns ‘MeasureID’, ‘MeasureYear’, ‘OrigPlotID1’, and ‘baseSA’, the latter being stand age at year of first measurement https://drive.google.com/file/d/1LmOaEtCZ6EBeIlAm6ttfLqBqQnQu4Ca7/view?usp=sharing
PSPgis sf Plot location sf object. Defaults to PSP data stripped of real plotIDs/location. Must include field ‘OrigPlotID1’ for joining to PSPplot object https://drive.google.com/file/d/1LmOaEtCZ6EBeIlAm6ttfLqBqQnQu4Ca7/view?usp=sharing
species data.table A table of invariant species traits with the following trait colums: ‘species’, ‘Area’, ‘longevity’, ‘sexualmature’, ‘shadetolerance’, ‘firetolerance’, ‘seeddistance_eff’, ‘seeddistance_max’, ‘resproutprob’, ‘mortalityshape’, ‘growthcurve’, ‘resproutage_min’, ‘resproutage_max’, ‘postfireregen’, ‘wooddecayrate’, ‘leaflongevity’ ‘leafLignin’, ‘hardsoft’. Only ‘growthcurve’ and ‘mortalityshape’ are used in this module. Default is from Dominic Cyr and Yan Boulanger’s applications of LANDIS-II https://raw.githubusercontent.com/dcyr/LANDIS-II_IA_generalUseFiles/master/speciesTraits.csv
speciesEcoregion data.table Table of spatially-varying species traits (‘maxB’, ‘maxANPP’, ‘establishprob’), defined by species and ‘ecoregionGroup’) Defaults to a dummy table based on dummy data os biomass, age, ecoregion and land cover class NA
sppEquiv data.table Table of species equivalencies. See ?LandR::sppEquivalencies_CA. NA
studyAreaANPP SpatialPolygonsDataFrame Study area used to crop PSP data before building growth curves NA

4.2.3 List of parameters

The full list of parameters used by the module is presented below (Table 4.2), all of which have default values specified in the module’s metadata.

Of these parameters, the following are particularly important:

Calibration parameters

  • GAMMiterations and GAMMknots – control the number of iterations and smoother degree used to fit the GAMMs, respectively.

  • constrainGrowthCurve, constrainMortalityShape and constrainMaxANPP – determine the upper and lower boundaries of the calibrated values of growthcurve, mortalityshape and maxANPP, respectively.

Data processing

  • minimumPlotsPerGamm – define a minimum number of PSP plots needed to fit the GAMMs.

  • PSPperiod – PSP data period to use.

  • quantileAgeSubset – upper quantile age value used to subset PSP data.

Table 4.2: List of Biomass_speciesParameters parameters and their description.
paramName paramClass default min max paramDesc
biomassModel character Lambert2005 NA NA The model used to calculate biomass from DBH. Can be either ‘Lambert2005’ or ‘Ung2008’
constrainGrowthCurve numeric 0.5, 0.5 0 1 Upper and lower bounds on range of potential growth curves when fitting traits. This module accepts a list of vectors, with names equal to P(sim)$sppEquivCol, so that traits are customizable
constrainMortalityShape numeric 15, 25 5 25 Upper and lower bounds on mortality shape when fitting traits. Low mortality curve values lead to numerous cohorts with very little biomass as longevity is approached, adding computation strain. Alternatively accepts a list of vectors, with names equal to those in sim$sppEquiv[, P(sim)$sppEquivCol]
constrainMaxANPP numeric 2, 5 1 10 Upper and lower bounds on maxANPP when fitting traits. When cohorts are initiated with B = maxANPP (see Biomass_core parameter P(sim)$initialB), this can lead to unreasonably initial biomass if maxANPP is also high. Both maxANPP and growthcurve parameters control when maxB is reached. High maxANPP results in earlier peaks. Alternatively accepts a list of vectors, with names equal to those in sim$sppEquiv[, P(sim)$sppEquivCol]
GAMMiterations numeric 8 1 NA Number of iterations for GAMMs. Accepts a list of vectors, with names equal to those in sim$sppEquiv[, P(sim)$sppEquivCol], so that GAMMS are customizable per species
GAMMknots numeric 3 NA NA The number of knots to use in the GAMM. Either 3 or 4 is recommended. Accepts a list of vectors, with names equal to those in sim$sppEquiv[, P(sim)$sppEquivCol], so that GAMMS are customizable per species
minDBH integer 0 0 NA Minimum diameter at breast height (DBH) in cm used to filter PSP data. Defaults to 0cm, i.e. all tree measurements are used.
minimumPlotsPerGamm numeric 50 10 NA Minimum number of PSP plots before building GAMM
PSPperiod numeric 1920, 2019 NA NA The years by which to subset sample plot data, if desired. Must be a vector of length 2
quantileAgeSubset numeric 95 1 100 Quantile by which to subset PSP data. As older stands are sparsely represented, the oldest measurements become vastly more influential. This parameter accepts both a single value and a list of vectors named by sim$sppEquiv[, P(sim)$sppEquivCol]. The PSP stand ages are found in sim$speciesGAMMs$SPECIES$originalData, where SPECIES is the species ID
sppEquivCol character default NA NA The column in sim$sppEquiv data.table to group species by. This parameter should share the same name as in Biomass_borealDataPrep . PSPs are aggregated by names in the PSP column and traits estimated for species with corresponding names in the sim$sppEquiv[, P(sim)$sppEquivCol]
useHeight logical FALSE NA NA Should height be used to calculate biomass (in addition to DBH)? We advise against including height unless you are certain it is present in every PSP
.plotInitialTime numeric NA NA NA This describes the simulation time at which the first plot event should occur
.plotInterval numeric NA NA NA This describes the simulation time interval between plot events
.saveInitialTime numeric NA NA NA This describes the simulation time at which the first save event should occur
.saveInterval numeric NA NA NA This describes the simulation time interval between save events
.useCache logical FALSE NA NA Should this entire module be run with caching activated? This is generally intended for data-type modules, where stochasticity and time are not relevant

4.2.4 List of outputs

The module produces the following outputs (Table 4.3). Note that species and speciesEcoregion are modified versions of the inputed objects with the same name.

Tables

  • species and speciesEcoregion – tables with calibrated trait values.

  • speciesGAMMs – the fitted GAMM model objects for each species.

Table 4.3: List of Biomass_speciesParameters output objects and their description.
objectName objectClass desc
species data.table The updated invariant species traits table (see description for this object in inputs)
speciesEcoregion data.table The updated spatially-varying species traits table (see description for this object in inputs)
speciesGAMMs list A list of mixed-effect general additive models (GAMMs) for each tree species modeling biomass as a function of age

4.2.5 Simulation flow and module events

Biomass_speciesParameters initialies itself and prepares all inputs provided there is an active internet connection and the user has access to the data (and a Google Account to do so).

We advise future users to run Biomass_speciesParameters with defaults and inspect what the objects are like before supplying their own data. The user does not need to run Biomass_speciesFactorial to generate their own theoretical curves (unless they wish to), as the module accesses a pre-generated on-line library with these simulated data.

Note that this module only runs once (in one “time step”) and only executes one event (init). The general flow of Biomass_speciesParameters processes is:

  1. Preparation of all necessary data and input objects that do not require parameter fitting (e.g., the theoretical species growth curve data);

  2. Sub-setting PSP data and calculating the observed species growth curves using GAMMs;

  3. Finding the theoretical species growth curve that best matches the observed curve, for each species. Theoretical curves are subset to those with longevity matching the species’ longevity (in species table) and with growthcurve and mortalityshape values within the chosen boundaries (P(sim)$constrainGrowthCurve, P(sim)$constrainMortalityShape);

  4. Calibrating maxB and maxANPP

  5. Adjusting maxANPP to match the chosen boundaries (P(sim)$constrainMaxANPP)

4.3 Usage example

This module can be run stand-alone, but it won’t do much more than calibrate species trait values based on dummy input trait values. We provide an example of this below, since it may be of value to run the module by itself to become acquainted with the calibration process and explore the fitted GAMMs. However, we remind that to run this example you will need a Google Account, and access to the data may need to be granted.

A realistic usage example of this module and a few others can be found in this repository and in Barros et al. (in review).

4.3.1 Load SpaDES and other packages.

4.3.2 Set up R libraries

options(repos = c(CRAN = "https://cloud.r-project.org"))
tempDir <- tempdir()

pkgPath <- file.path(tempDir, "packages", version$platform, paste0(version$major,
    ".", strsplit(version$minor, "[.]")[[1]][1]))
dir.create(pkgPath, recursive = TRUE)
.libPaths(pkgPath, include.site = FALSE)

if (!require(Require, lib.loc = pkgPath)) {
    remotes::install_github(paste0("PredictiveEcology/", "Require@5c44205bf407f613f53546be652a438ef1248147"),
        upgrade = FALSE, force = TRUE)
    library(Require, lib.loc = pkgPath)
}

setLinuxBinaryRepo()

4.3.3 Get the module and module dependencies

Require(pasteo("PredictiveEcology/", "SpaDES.project@6d7de6ee12fc967c7c60de44f1aa3b04e6eeb5db"),
    require = FALSE, upgrade = FALSE, standAlone = TRUE)

paths <- list(inputPath = normPath(file.path(tempDir, "inputs")),
    cachePath = normPath(file.path(tempDir, "cache")), modulePath = normPath(file.path(tempDir,
        "modules")), outputPath = normPath(file.path(tempDir,
        "outputs")))

SpaDES.project::getModule(modulePath = paths$modulePath, c("PredictiveEcology/Biomass_speciesParameters@master"),
    overwrite = TRUE)

## make sure all necessary packages are installed:
outs <- SpaDES.project::packagesInModules(modulePath = paths$modulePath)
Require(c(unname(unlist(outs)), "SpaDES"), require = FALSE, standAlone = TRUE)

## load necessary packages
Require(c("SpaDES"), upgrade = FALSE, install = FALSE)

4.3.4 Setup simulation

times <- list(start = 0, end = 1)

modules <- list("Biomass_speciesParameters")

# the purpose of this table is experiment with modify longevity -
# longevity is not estimated by the module but it is used in trait
# estimation.
inputSpecies <- data.table(species = c("Abie_bal", "Abie_las", "Betu_pap",
    "Lari_lar", "Pice_eng", "Pice_gla", "Pice_mar", "Pinu_ban", "Pinu_con",
    "Pseu_men", "Popu_tre"), longevity = c(300, 300, 170, 170, 330, 250,
    250, 175, 300, 600, 200), mortalityshape = 15, growthcurve = 0)
objects <- list(species = inputSpecies)

inputs <- list()
outputs <- list()

parameters <- list(Biomass_speciesParameters = list(GAMMiterations = 2,
    GAMMknots = list(Abie_bal = 3, Abie_las = 3, Betu_pap = 3, Lari_lar = 4,
        Pice_eng = 4, Pice_gla = 3, Pice_mar = 4, Pinu_ban = 3, Pinu_con = 4,
        Popu_tre = 4, Pseu_men = 3), minimumPlotsPerGamm = 40, constrainMortalityShape = list(Abie_bal = c(15,
        25), Abie_las = c(15, 25), Betu_pap = c(15, 20), Lari_lar = c(20,
        25), Pice_eng = c(20, 25), Pice_gla = c(20, 25), Pice_mar = c(15,
        25), Pinu_ban = c(15, 25), Pinu_con = c(15, 25), Popu_tre = c(20,
        25), Pseu_men = c(20, 25)), constrainGrowthCurve = list(Abie_bal = c(0,
        1), Abie_las = c(0, 1), Betu_pap = c(0, 1), Lari_lar = c(0, 1),
        Pice_eng = c(0, 1), Pice_gla = c(0, 1), Pice_mar = c(0, 1), Pinu_ban = c(0,
            1), Pinu_con = c(0, 1), Popu_tre = c(0, 1), Pseu_men = c(0,
            1)), quantileAgeSubset = list(Abie_bal = 95, Abie_las = 95,
        Betu_pap = 95, Lari_lar = 95, Pice_eng = 95, Pice_gla = 95, Pice_mar = 95,
        Pinu_ban = 95, Pinu_con = 99, Popu_tre = 99, Pseu_men = 99)))


mySim <- simInitAndSpades(times = times, params = parameters, modules = modules,
    paths = paths, objects = objects)

## to inspect the fitted GAMM models:
mySim$speciesGAMMs$Pice_mar

4.4 References