A simple R package to derive flags for aggregates
> devtools::install_github("eurostat/flagr")
A flag is an attribute of a cell in a data set that provides additional qualitative information about the statistical value of that cell. Flags can indicate, for example, that a given value is estimated, confidential or represents a break in the time series.
Different sets of flags are currently in use in the European Statistical System (ESS). Some domains use the SDMX code list for observation status and confidentiality status. Eurostat uses a simplified list of flags for dissemination, and other domains apply different sets of flags defined in regulations or in other agreements.
In most cases it is well defined how a flag is assigned to an individual value, but it is not straightforward which flag should be propagated to an aggregated value such as a sum, average, quintiles, etc. This package (flagr) was created to help users assign a flag to an aggregate based on the underlying flags and values.
The package contains a fictive test data set (`test_data`), a wrapping function (`propagate_flag`) that calls the different methods, and three methods (`flag_hierarchy`, `flag_frequency` and `flag_weighted`) to derive flags for aggregates.
The `flag_hierarchy` method returns the flag that is listed first in a given ordered set of flags, the `flag_frequency` method returns the most frequent flag for the aggregate, and the `flag_weighted` method returns the flag whose cumulative weight is the highest.
Detailed documentation of the functions is in the package; see the vignette for more information.
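The three rules can be illustrated on a single vector of flags with a few lines of base R. This is only a sketch of the ideas described above, not the package's implementation: the helper names (`pick_by_hierarchy`, `pick_by_frequency`, `pick_by_weight`), the sample data, and the handling of ties and missing flags are all assumptions made for illustration.

```r
# Illustrative helpers with hypothetical names; these are NOT flagr functions.

# Hierarchy rule: return the first flag of `ordered_flags` that occurs in `f`.
pick_by_hierarchy <- function(f, ordered_flags) {
  present <- ordered_flags[ordered_flags %in% f]
  if (length(present) == 0) NA_character_ else present[1]
}

# Frequency rule: return the most frequent non-missing flag.
# Ties go to the alphabetically first flag, an arbitrary choice made here.
pick_by_frequency <- function(f) {
  counts <- table(f[!is.na(f)])
  if (length(counts) == 0) return(NA_character_)
  names(counts)[which.max(counts)]
}

# Weighted rule: sum the weights per flag and return the flag
# with the largest cumulative weight.
pick_by_weight <- function(f, w) {
  keep <- !is.na(f)
  if (!any(keep)) return(NA_character_)
  totals <- tapply(w[keep], f[keep], sum)
  names(totals)[which.max(totals)]
}

# Made-up flags and weights for five underlying cells of one aggregate.
f <- c("p", "e", "e", NA, "b")
w <- c(50, 10, 15, 20, 5)

by_hierarchy <- pick_by_hierarchy(f, c("b", "c", "d", "e", "p", "s", "u"))
by_frequency <- pick_by_frequency(f)
by_weight    <- pick_by_weight(f, w)
```

With this sample, the three rules pick three different flags ("b", "e" and "p" respectively), which is why the choice of method matters for the aggregate.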
> library(flagr)
> library(tidyr)
> flags <- spread(test_data[, c(1:3)], key = time, value = flags)
>
> # hierarchy method
> propagate_flag(flags[, c(2:ncol(flags))], "hierarchy", "puebscd")
> propagate_flag(flags[, c(2:ncol(flags))], "hierarchy", c("b", "c", "d", "e", "p", "s", "u"))
>
> # frequency method
> propagate_flag(flags[, c(2:ncol(flags))], "frequency")
>
> # weighted method
> flags <- flags[, c(2:ncol(flags))]
> weights <- spread(test_data[, c(1, 3:4)], key = time, value = values)
> weights <- weights[, c(2:ncol(weights))]
>
> propagate_flag(flags, "weighted", flag_weights = weights, threshold = 0.1)
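To make the data shapes in the example above more concrete, the following base-R sketch reshapes a made-up long table into wide flag and weight tables and then applies a weighted rule with a threshold to each period. The column names (`geo`, `flags`, `time`, `values`), the helpers `to_wide` and `weighted_flag`, and the reading of the threshold as a minimum share of the total weight are assumptions for illustration; check the package documentation for the exact behaviour of `propagate_flag`.

```r
# Toy long-format data: one row per country and year (made-up values).
long <- data.frame(
  geo    = rep(c("AA", "BB", "CC"), times = 2),
  flags  = c("e", "p", NA, "e", "e", "p"),
  time   = rep(c(2019, 2020), each = 3),
  values = c(10, 30, 60, 25, 35, 40),
  stringsAsFactors = FALSE
)

# Long to wide: rows are countries, columns are years.
# Each (geo, time) cell holds exactly one observation in this toy data.
to_wide <- function(x) {
  tapply(x, list(long$geo, long$time), function(v) v[1])
}
flags_wide   <- to_wide(long$flags)
weights_wide <- to_wide(long$values)

# Hypothetical helper: flag with the largest summed weight for one period,
# kept only if its share of the total weight reaches `threshold`
# (an assumed interpretation of the threshold).
weighted_flag <- function(f, w, threshold) {
  keep <- !is.na(f)
  if (!any(keep)) return(NA_character_)
  totals <- tapply(w[keep], f[keep], sum)
  best <- which.max(totals)
  if (totals[best] / sum(w) >= threshold) names(totals)[best] else NA_character_
}

# Apply the rule column by column, one result per year.
result <- sapply(colnames(flags_wide), function(yr) {
  weighted_flag(flags_wide[, yr], weights_wide[, yr], threshold = 0.5)
})
```

In this toy data, the leading flag in 2019 covers only 30% of the total weight and is therefore dropped under the assumed threshold rule, while in 2020 the flag "e" covers 60% and is kept.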