This vignette provides the technical details underlying the proto_ipm
data structure. Most users will probably not care about much, if any, of
the information in this vignette, and can skip it. It is primarily aimed
at those interested in extending the data structure for their own
purposes, or those who are just curious how this all works.

The primary motivation in defining this data structure was to provide a
middle ground between user-defined models and ones that are stored in
the PADRINO database. Defining this means we can use the exact same
machinery to generate the kernels once we’ve defined the proto_ipm
itself. This considerably reduces code duplication across tasks.
Additionally, the hope is that ipmr becomes widely used, and authors can
just email us a proto_ipm and a few other bits of metadata (or even just
put it in the supplementary data of their manuscripts), and we can stick
it into the database. This last point may well be a pipe dream, but we
all need something to hope for, right?

The proto_ipm data structure powers most of ipmr. It represents the
minimum amount of information one needs to specify to implement an IPM,
and provides a (hopefully) standardized data structure going forward. It
has the following classes: model_class (defined in init_ipm()),
proto_ipm, and data.frame. It will always have as many rows as calls to
define_kernel() that are used in the model definition pipeline. Each row
corresponds to 1 or more kernels.
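
To make the "data.frame with extra classes and list columns" idea concrete, here is a minimal base R sketch. It does not call ipmr at all: the object is built by hand, "my_model_class" is a placeholder for whatever class init_ipm() assigns, and the column contents are purely illustrative rather than ipmr's actual layout.

```r
# Hand-built stand-in for a proto_ipm (NOT created by ipmr).
# Two rows, as if define_kernel() had been called twice.
proto <- data.frame(
  id        = c("A", "A"),   # same model ID on every row
  kernel_id = c("P", "F"),   # one kernel name per row
  stringsAsFactors = FALSE
)

# A list column stores one arbitrary R object per row.
# The inner structure here is an assumption made for illustration.
proto$state_var <- list("z", "z")

# Prepend the extra classes. "my_model_class" is a hypothetical name.
class(proto) <- c("my_model_class", "proto_ipm", "data.frame")

nrow(proto)                    # number of kernel definitions
inherits(proto, "data.frame")  # data.frame methods still dispatch
```

Because data.frame stays in the class vector, ordinary subsetting and printing keep working, while the leading classes give S3 methods something to dispatch on.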

NB: A row may correspond to more than 1 kernel when discrete effects
like age, year, or site are included in the model. When this happens,
expand.grid() is called on the list in par_set_indices, and the values
in each row of the result are paste()’d together with collapse = "_".
This creates fully crossed levels. Each row that contains the grouping
effects is then replicated, but with a single level substituted in for
the suffix(es). These steps are implemented within make_ipm() (and, more
specifically, .initialize_kernels()), and so you will never actually see
the results of the suffix expansion when you print a proto_ipm, only in
the output from make_ipm().
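
The suffix expansion described above can be sketched in base R. This is only an approximation of the idea, not the code inside .initialize_kernels(): the kernel name "P_site_yr" and the gsub()-based substitution are assumptions made for illustration.

```r
# Grouping levels, in the same shape as the par_set_indices column.
par_set_indices <- list(site = c("A", "B"), yr = 1:2)

# Fully cross the levels: one row per site-by-year combination.
levels_grid <- expand.grid(par_set_indices, stringsAsFactors = FALSE)

# Paste the values in each row together with collapse = "_".
suffixes <- vapply(
  seq_len(nrow(levels_grid)),
  function(i) paste(unlist(levels_grid[i, ]), collapse = "_"),
  character(1)
)

# Replicate a suffixed kernel name once per combination, substituting a
# single level for each suffix. "P_site_yr" is a hypothetical name.
kernel_template <- "P_site_yr"
kernel_names <- vapply(
  seq_len(nrow(levels_grid)),
  function(i) {
    out <- kernel_template
    for (nm in names(levels_grid)) {
      out <- gsub(nm, as.character(levels_grid[i, nm]), out, fixed = TRUE)
    }
    out
  },
  character(1)
)

suffixes      # "A_1" "B_1" "A_2" "B_2"
kernel_names  # "P_A_1" "P_B_1" "P_A_2" "P_B_2"
```

Note that expand.grid() varies the first element fastest, which is why the site levels cycle within each year.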

The proto_ipm will also always contain the following columns:

id: This is a model ID, and not especially useful for user-defined
models. It will always be the same across all rows for a single model.
It is included to assist with implementing PADRINO, a database of
Integral Projection Models.

kernel_id: This is the name of each kernel from define_kernel() and is a
character string. make_ipm() creates an object with this name from the
parameters and functions in the params column.

domain: This is a list column and contains the information that defines
the domain name, beginning state, and ending state of the kernel. These
are either numeric vectors of length 3, with smallest value, largest
value, and number of values (for continuous state variables), or
NA_real_ (for discrete state variables).

state_var: A list column. This contains the names of the state variables
that the kernel operates on.

int_rule: A character string. This is the name of the integration rule
for the kernel. In the case of a discrete to discrete transition, this
is just NA_character_.

evict: Either TRUE or FALSE. Denotes whether the kernel will have an
eviction correction applied.

evict_fun: A list column. If evict is TRUE, each entry will contain one
or more quosures. These should be calls that modify one or more vital
rate expressions to generate the kernel itself.

pop_state: A list column. Contains either a quosure that generates the
initial population state vector(s), or pre-evaluated vectors if passed
into the pop_vectors slot of define_pop_state().

env_state: A list column. This contains quosure(s)
that sample a continuous environmental variable, and the data that are
needed to evaluate the quosures. If the quosure contains a user-defined
function, then the data should also contain a variable pointing to that
function’s definition (usually in the global environment).

uses_par_sets: Either TRUE or FALSE. This indicates whether to check all
of the names in the model for grouping effects. If TRUE, .split_par_sets
is called inside of make_ipm().

par_set_indices: A list. If uses_par_sets is TRUE, then this will
contain a list of integer or character vectors. The names in the list
should correspond to the suffixes used in vital rate
expressions/variable names, and the values in the list should correspond
to the values the suffix can take on (e.g.
list(site = c("A", "B", "C"), yr = 1:5)).

uses_age: Either TRUE or FALSE. Indicates whether the model has age
structure.

age_indices: A list with the age range for the model. Can contain 1 or 2
components: age and, optionally, max_age. age should be an integer
vector, and max_age should be a single integer denoting the maximum age
in the model.

params: A list. This will always have 4 entries: formula, family,
vr_text, and params.

formula: This is the kernel’s formula as a text string.

family: This is the type of transition the kernel describes (e.g.
continuous -> continuous, continuous -> discrete, etc.). It can be one
of: c("CC", "CD", "DC", "DD"). The first letter indicates what type of
state the kernel starts on, and the second what type it ends on.

vr_text: This is a named list of text strings that represent vital
rates. The name of each entry is the name of the vital rate. Each text
string gets processed for grouping effects and possibly age, and then
parsed prior to evaluation. The conversion from a call to text back to a
call lets us implement ipmr’s suffix syntax. When the text strings are
converted to expressions and evaluated, the values these expressions
generate are bound to the names of the list.

params: This contains a named list of all constants used in the kernel’s
vital rates. These are usually regression coefficients, regression
models themselves, or parameters derived from other data sources (e.g.
germination rates from a seed sowing experiment). For coefficients and
raw parameter values, each name in the list should only correspond to a
single value. In addition, all values in this list will have a
"flat_protect" attribute appended to them. See below for more details.

usr_funs: A list. If the user has specified their own functions in the
call to make_ipm(), they will get stored here.
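
The text-to-expression round trip described for vr_text can be illustrated with base R alone. The following is a rough sketch under stated assumptions, not ipmr's implementation: eval_vital_rates() is a hypothetical helper, the parameter names are made up, and a plain gsub() stands in for the real suffix handling.

```r
# Vital rates stored as text, using a "_yr" suffix (hypothetical names).
vr_text <- list(
  s    = "1 / (1 + exp(-(s_int_yr + s_slope * z)))",
  g_mu = "g_int_yr + g_slope * z"
)

# Constants, one value per name, with year-specific intercepts.
params <- list(
  s_int_2001 = -1,  s_int_2002 = -0.5, s_slope = 0.2,
  g_int_2001 = 0.1, g_int_2002 = 0.3,  g_slope = 0.9
)

# Hypothetical helper: substitute one level for the suffix, parse the
# text into a call, evaluate it, and bind the result to the entry name.
eval_vital_rates <- function(vr_text, params, suffix, level, z) {
  env <- list2env(c(params, list(z = z)), parent = baseenv())
  for (nm in names(vr_text)) {
    txt  <- gsub(suffix, level, vr_text[[nm]], fixed = TRUE)
    expr <- parse(text = txt)[[1]]
    assign(nm, eval(expr, envir = env), envir = env)
  }
  mget(names(vr_text), envir = env)
}

out <- eval_vital_rates(vr_text, params, suffix = "yr", level = "2001",
                        z = c(1, 2))
out$g_mu  # 0.1 + 0.9 * c(1, 2)
```

Keeping the expressions as text until the last moment is what makes the substitution step a simple string operation; once parsed, each result is bound to its name so later expressions can refer to it.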

flat_protect attribute

.flatten_to_depth is an internal function that takes a nested list and
“flattens” it to the depth specified in the 2nd argument. Names are
preserved at the most nested level of the list. This function is used
extensively by functions in ipmr to ensure that binding things to
evaluation environments goes smoothly.

The flat_protect attribute controls whether .flatten_to_depth “flattens”
a particular list element or leaves it untouched. This is necessary to
prevent it from recursively flattening regression model objects, which
themselves are usually lists. predict methods require an intact list
though, so squashing them down and stripping their classes/other
attributes prevents those from working. In general, values in the
data_list should not have this attribute set to TRUE unless they are
regression models passed to predict methods in the vital rate
expressions. ipmr sets this attribute internally using .protect_model in
define_kernel(). Adding new model classes requires updating the output
from .supported_models.
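
A small base R sketch may help show why the attribute matters. Both protect() and flatten_protected() below are hypothetical stand-ins written for this example. They are not ipmr's .protect_model or .flatten_to_depth, and this simplified version flattens all the way down rather than to a requested depth.

```r
# Hypothetical helper: mark an object so the flattener leaves it alone.
protect <- function(x) {
  attr(x, "flat_protect") <- TRUE
  x
}

# Hypothetical flattener: recurse into plain lists, but keep any element
# whose "flat_protect" attribute is TRUE intact.
flatten_protected <- function(x) {
  out <- list()
  for (i in seq_along(x)) {
    el <- x[[i]]
    if (is.list(el) && !isTRUE(attr(el, "flat_protect"))) {
      out <- c(out, flatten_protected(el))
    } else {
      out <- c(out, stats::setNames(list(el), names(x)[i]))
    }
  }
  out
}

# A regression model is itself a list, so it needs protecting.
fit <- lm(dist ~ speed, data = cars)

nested <- list(
  coefs  = list(s_int = -1, s_slope = 0.2),
  models = list(grow_mod = protect(fit))
)

flat <- flatten_protected(nested)
names(flat)                      # "s_int" "s_slope" "grow_mod"
inherits(flat$grow_mod, "lm")    # TRUE: class and structure survive
predict(flat$grow_mod, newdata = data.frame(speed = 10))
```

Without the protect() call, the recursion would descend into the lm object and scatter its components (coefficients, residuals, and so on) across the flattened list, leaving nothing for predict() to dispatch on.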