Parsing utilities are a set of functions that helps generate parsing spec for tf$parse_example
to be used with estimators. If users keep data in tf$Example
format, they need to call tf$parse_example
with a proper feature spec. There are two main things that these utility functions help:
Users need to combine parsing spec of features with labels and weights (if any) since they are all parsed from same tf$Example
instance. The utility functions combine these specs.
It is difficult to map expected label by a estimator such as dnn_classifier
to corresponding tf$parse_example
spec. The utility functions encode it by getting related information from users (key, dtype).
<- classifier_parse_example_spec(
parsing_spec feature_columns = column_numeric('a'),
label_key = 'b',
weight_column = 'c'
)
For the above example, classifier_parse_example_spec
would return the following:
<- list(
expected_spec a = tf$python$ops$parsing_ops$FixedLenFeature(reticulate::tuple(1L), dtype = tf$float32),
c = tf$python$ops$parsing_ops$FixedLenFeature(reticulate::tuple(1L), dtype = tf$float32),
b = tf$python$ops$parsing_ops$FixedLenFeature(reticulate::tuple(1L), dtype = tf$int64)
)
# This should be the same as the one we constructed using `classifier_parse_example_spec`
::expect_equal(parsing_spec, expected_spec) testthat
Firstly, define features transformations and initiailize your classifier similar to the following:
<- feature_columns(...)
fcs
<- dnn_classifier(
model n_classes = 1000,
feature_columns = fcs,
weight_column = 'example-weight',
label_vocabulary= c('photos', 'keep', ...),
hidden_units = c(256, 64, 16)
)
Next, create the parsing configuration for tf$parse_example
using classifier_parse_example_spec
and the feature columns fcs
we have just defined:
<- classifier_parse_example_spec(
parsing_spec feature_columns = fcs,
label_key = 'my-label',
label_dtype = tf$string,
weight_column = 'example-weight'
)
This label configuration tells the classifier the following:
c('photos', 'keep', ...)
Then define your input function with the help of read_batch_features
that reads the batches of features from files in tf$Example
format with the parsing configuration parsing_spec
we just defined:
<- function() {
input_fn_train <- tf$contrib$learn$read_batch_features(
features file_pattern = train_files,
batch_size = batch_size,
features = parsing_spec,
reader = tf$RecordIOReader)
<- features[["my-label"]]
labels return(list(features, labels))
}
Finally we can train the model using the training input function parsed by classifier_parse_example_spec
:
train(model, input_fn = input_fn_train)