Binary Features
Binary Features Preprocessing¶
Binary features are directly transformed into a binary valued vector of length n (where n is the size of the dataset) and added to the HDF5 with a key that reflects the name of column in the dataset.
No additional information about them is available in the JSON metadata file.
The parameters available for preprocessing are
missing_value_strategy(defaultfill_with_const): what strategy to follow when there's a missing value in a binary column. The value should be one offill_with_const(replaces the missing value with a specific value specified with thefill_valueparameter),fill_with_mode(replaces the missing values with the most frequent value in the column),fill_with_mean(replaces the missing values with the mean of the values in the column),backfill(replaces the missing values with the next valid value).fill_value(default0): the value to replace the missing values with in case themissing_value_strategyisfill_with_const.
Binary Input Features and Encoders¶
Binary features have two encoders.
One encoder (passthrough') takes the raw binary values coming from the input placeholders are just returned as outputs.
Inputs are of size b while outputs are of size b x 1 where b is the batch size.
The other encoder ('dense') passes the raw binary values through a fully connected layers.
In this case the inputs of size b are transformed to size b x h.
Example binary feature entry in the input features list:
name: binary_column_name
type: binary
encoder: passthrough
Binary input feature parameters are
encoder(default'passthrough') encodes the binary feature. Valid choices:'passthrough': binary feature is passed through as-is,'dense': binary feature is fed through a fully connected layer.
There are no additional parameters for the passthrough encoder.
Dense Encoder Parameters¶
For the dense encoder these are the available parameters.
num_layers(default1): this is the number of stacked fully connected layers that the input to the feature passes through. Their output is projected in the feature's output space.fc_size(default256): f afc_sizeis not already specified infc_layersthis is the defaultfc_sizethat will be used for each layer. It indicates the size of the output of a fully connected layer.use_bias(defaulttrue): boolean, whether the layer uses a bias vector.weights_initializer(default'glorot_uniform'): initializer for the weights matrix. Options are:constant,identity,zeros,ones,orthogonal,normal,uniform,truncated_normal,variance_scaling,glorot_normal,glorot_uniform,xavier_normal,xavier_uniform,he_normal,he_uniform,lecun_normal,lecun_uniform. Alternatively it is possible to specify a dictionary with a keytypethat identifies the type of initializer and other keys for its parameters, e.g.{type: normal, mean: 0, stddev: 0}. To know the parameters of each initializer, please refer to TensorFlow's documentation.bias_initializer(default'zeros'): initializer for the bias vector. Options are:constant,identity,zeros,ones,orthogonal,normal,uniform,truncated_normal,variance_scaling,glorot_normal,glorot_uniform,xavier_normal,xavier_uniform,he_normal,he_uniform,lecun_normal,lecun_uniform. Alternatively it is possible to specify a dictionary with a keytypethat identifies the type of initializer and other keys for its parameters, e.g.{type: normal, mean: 0, stddev: 0}. To know the parameters of each initializer, please refer to TensorFlow's documentation.weights_regularizer(defaultnull): regularizer function applied to the weights matrix. Valid values arel1,l2orl1_l2.bias_regularizer(defaultnull): regularizer function applied to the bias vector. Valid values arel1,l2orl1_l2.activity_regularizer(defaultnull): regurlizer function applied to the output of the layer. Valid values arel1,l2orl1_l2.norm(defaultnull): if anormis not already specified infc_layersthis is the defaultnormthat will be used for each layer. It indicates the norm of the output and it can benull,batchorlayer.norm_params(defaultnull): parameters used ifnormis eitherbatchorlayer. For information on parameters used withbatchsee Tensorflow's documentation on batch normalization or forlayersee Tensorflow's documentation on layer normalization.activation(defaultrelu): if anactivationis not already specified infc_layersthis is the defaultactivationthat will be used for each layer. It indicates the activation function applied to the output.dropout(default0): dropout rate
Binary Output Features and Decoders¶
Binary features can be used when a binary classification needs to be performed or when the output is a single probability. There is only one decoder available for binary features and it is a (potentially empty) stack of fully connected layers, followed by a projection into a single number followed by a sigmoid function.
These are the available parameters of a binary output feature
reduce_input(defaultsum): defines how to reduce an input that is not a vector, but a matrix or a higher order tensor, on the first dimension (second if you count the batch dimension). Available values are:sum,meanoravg,max,concat(concatenates along the first dimension),last(returns the last vector of the first dimension).dependencies(default[]): the output features this one is dependent on. For a detailed explanation refer to Output Features Dependencies.reduce_dependencies(defaultsum): defines how to reduce the output of a dependent feature that is not a vector, but a matrix or a higher order tensor, on the first dimension (second if you count the batch dimension). Available values are:sum,meanoravg,max,concat(concatenates along the first dimension),last(returns the last vector of the first dimension).loss(default{type: cross_entropy, confidence_penalty: 0, robust_lambda: 0, positive_class_weight: 1}): is a dictionary containing a losstypeand its hyper-parameters. The only available losstypeiscross_entropy(cross entropy), and the optional parameters areconfidence_penalty(an additional term that penalizes too confident predictions by adding aa * (max_entropy - entropy) / max_entropyterm to the loss, where a is the value of this parameter),robust_lambda(replaces the loss with(1 - robust_lambda) * loss + robust_lambda / 2which is useful in case of noisy labels) andpositive_class_weight(multiplies the loss for the positive class, increasing its importance).
These are the available parameters of a binary output feature decoder
fc_layers(defaultnull): it is a list of dictionaries containing the parameters of all the fully connected layers. The length of the list determines the number of stacked fully connected layers and the content of each dictionary determines the parameters for a specific layer. The available parameters for each layer are:fc_size,norm,activation,dropout,initializerandregularize. If any of those values is missing from the dictionary, the default one specified as a parameter of the decoder will be used instead.num_fc_layers(default 0): this is the number of stacked fully connected layers that the input to the feature passes through. Their output is projected in the feature's output space.fc_size(default256): if afc_sizeis not already specified infc_layersthis is the defaultfc_sizethat will be used for each layer. It indicates the size of the output of a fully connected layer.activation(defaultrelu): if anactivationis not already specified infc_layersthis is the defaultactivationthat will be used for each layer. It indicates the activation function applied to the output.norm(defaultnull): if anormis not already specified infc_layersthis is the defaultnormthat will be used for each layer. It indicates the norm of the output and it can benull,batchorlayer.norm_params(defaultnull): parameters used ifnormis eitherbatchorlayer. For information on parameters used withbatchsee Tensorflow's documentation on batch normalization or forlayersee Tensorflow's documentation on layer normalization.dropout(default0): dropout rateuse_base(defaulttrue): boolean, whether the layer uses a bias vector.weights_initializer(default'glorot_uniform'): initializer for the weights matrix. Options are:constant,identity,zeros,ones,orthogonal,normal,uniform,truncated_normal,variance_scaling,glorot_normal,glorot_uniform,xavier_normal,xavier_uniform,he_normal,he_uniform,lecun_normal,lecun_uniform. Alternatively it is possible to specify a dictionary with a keytypethat identifies the type of initializer and other keys for its parameters, e.g.{type: normal, mean: 0, stddev: 0}. To know the parameters of each initializer, please refer to TensorFlow's documentation.bias_initializer(default'zeros'): initializer for the bias vector. Options are:constant,identity,zeros,ones,orthogonal,normal,uniform,truncated_normal,variance_scaling,glorot_normal,glorot_uniform,xavier_normal,xavier_uniform,he_normal,he_uniform,lecun_normal,lecun_uniform. Alternatively it is possible to specify a dictionary with a keytypethat identifies the type of initializer and other keys for its parameters, e.g.{type: normal, mean: 0, stddev: 0}. To know the parameters of each initializer, please refer to TensorFlow's documentation.weights_regularizer(defaultnull): regularizer function applied to the weights matrix. Valid values arel1,l2orl1_l2.bias_regularizer(defaultnull): regularizer function applied to the bias vector. Valid values arel1,l2orl1_l2.activity_regularizer(defaultnull): regurlizer function applied to the output of the layer. Valid values arel1,l2orl1_l2.threshold(default0.5): The threshold above (greater or equal) which the predicted output of the sigmoid will be mapped to 1.
Example binary feature entry (with default parameters) in the output features list:
name: binary_column_name
type: binary
reduce_input: sum
dependencies: []
reduce_dependencies: sum
loss:
type: cross_entropy
confidence_penalty: 0
robust_lambda: 0
positive_class_weight: 1
fc_layers: null
num_fc_layers: 0
fc_size: 256
activation: relu
norm: null
dropout: 0.2
weisghts_intializer: glorot_uniform
bias_initializer: zeros
weights_regularizer: l1
bias_regularizer: l1
threshold: 0.5
Binary Features Measures¶
The only measures that are calculated every epoch and are available for binary features are the accuracy and the loss itself.
You can set either of them as validation_measure in the training section of the configuration if you set the validation_field to be the name of a binary feature.