H3 Features
H3 is an indexing system for representing geospatial data. For more details about it, refer to https://eng.uber.com/h3.
Preprocessing

Ludwig will automatically parse the H3 64-bit encoded format.
preprocessing:
    missing_value_strategy: fill_with_const
    fill_value: 576495936675512319
Parameters:

- `missing_value_strategy` (default: `fill_with_const`): What strategy to follow when there's a missing value in an H3 column. Options: `fill_with_const`, `fill_with_mode`, `bfill`, `ffill`, `drop_row`. See Missing Value Strategy for details.
- `fill_value` (default: `576495936675512319`): The value to replace missing values with in case the `missing_value_strategy` is `fill_with_const`.
Preprocessing parameters can also be defined once and applied to all H3 input features using the Type-Global Preprocessing section.
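For example, a minimal sketch of such a type-global override using the top-level defaults section (assuming the config also declares at least one H3 input feature):

defaults:
    h3:
        preprocessing:
            missing_value_strategy: fill_with_const
            fill_value: 576495936675512319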
Input Features

Input H3 features are transformed into int-valued tensors of size N x 19 (where N is the size of the dataset). The 19 dimensions represent 4 components of the H3 index (mode, edge, resolution, base cell) and 15 cell coordinate values.
The encoder parameters specified at the feature level are:

- `tied` (default: `null`): name of another input feature to tie the weights of the encoder with. It needs to be the name of a feature of the same type and with the same encoder parameters.
Example H3 feature entry in the input features list:
name: h3_feature_name
type: h3
tied: null
encoder:
    type: embed
The available encoder parameters are:

- `type` (default: `embed`): the possible values are `embed`, `weighted_sum`, and `rnn`.
Encoder type and encoder parameters can also be defined once and applied to all H3 input features using the Type-Global Encoder section.
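Similarly, a minimal sketch of a type-global encoder override via the defaults section:

defaults:
    h3:
        encoder:
            type: weighted_sum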
Encoders

Embed Encoder
This encoder encodes each component of the H3 representation (mode, edge, resolution, base cell and children cells) with embeddings. Children cells with value 0
will be masked out. After the embedding, all embeddings are summed and optionally passed through a stack of fully connected layers.
encoder:
    type: embed
    dropout: 0.0
    embedding_size: 10
    output_size: 10
    activation: relu
    norm: null
    use_bias: true
    bias_initializer: zeros
    weights_initializer: xavier_uniform
    embeddings_on_cpu: false
    reduce_output: sum
    norm_params: null
    num_fc_layers: 0
    fc_layers: null
Parameters:

- `dropout` (default: `0.0`): Dropout probability for the embedding.
- `embedding_size` (default: `10`): The maximum embedding size adopted.
- `output_size` (default: `10`): If an `output_size` is not already specified in `fc_layers`, this is the default `output_size` that will be used for each layer. It indicates the size of the output of a fully connected layer.
- `activation` (default: `relu`): The default activation function that will be used for each layer. Options: `elu`, `leakyRelu`, `logSigmoid`, `relu`, `sigmoid`, `tanh`, `softmax`, `null`.
- `norm` (default: `null`): The default norm that will be used for each layer. Options: `batch`, `layer`, `null`. See Normalization for details.
- `use_bias` (default: `true`): Whether the layer uses a bias vector. Options: `true`, `false`.
- `bias_initializer` (default: `zeros`): Initializer to use for the bias vector. Options: `uniform`, `normal`, `constant`, `ones`, `zeros`, `eye`, `dirac`, `xavier_uniform`, `xavier_normal`, `kaiming_uniform`, `kaiming_normal`, `orthogonal`, `sparse`, `identity`.
- `weights_initializer` (default: `xavier_uniform`): Initializer to use for the weights matrix. Options: `uniform`, `normal`, `constant`, `ones`, `zeros`, `eye`, `dirac`, `xavier_uniform`, `xavier_normal`, `kaiming_uniform`, `kaiming_normal`, `orthogonal`, `sparse`, `identity`.
- `embeddings_on_cpu` (default: `false`): Whether to force the placement of the embedding matrix in regular memory and have the CPU resolve it. Options: `true`, `false`.
- `reduce_output` (default: `sum`): How to reduce the output tensor along the sequence length dimension if the rank of the tensor is greater than 2. Options: `last`, `sum`, `mean`, `avg`, `max`, `concat`, `attention`, `none`, `None`, `null`.
- `norm_params` (default: `null`): Parameters used if `norm` is either `batch` or `layer`.
- `num_fc_layers` (default: `0`): The number of stacked fully connected layers.
- `fc_layers` (default: `null`): List of dictionaries containing the parameters for each fully connected layer.
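For example, a sketch of an H3 input feature using the embed encoder with one fully connected layer on top (the `output_size` value is illustrative, not a recommendation):

name: h3_feature_name
type: h3
encoder:
    type: embed
    num_fc_layers: 1
    output_size: 32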
Weighted Sum Embed Encoder
This encoder encodes each component of the H3 representation (mode, edge, resolution, base cell and children cells) with embeddings. Children cells with value 0 will be masked out. After the embedding, all embeddings are combined in a weighted sum (with learned weights) and optionally passed through a stack of fully connected layers.
encoder:
    type: weighted_sum
    dropout: 0.0
    embedding_size: 10
    output_size: 10
    activation: relu
    norm: null
    use_bias: true
    bias_initializer: zeros
    weights_initializer: xavier_uniform
    embeddings_on_cpu: false
    should_softmax: false
    norm_params: null
    num_fc_layers: 0
    fc_layers: null
Parameters:

- `dropout` (default: `0.0`): Dropout probability for the embedding.
- `embedding_size` (default: `10`): The maximum embedding size adopted.
- `output_size` (default: `10`): If an `output_size` is not already specified in `fc_layers`, this is the default `output_size` that will be used for each layer. It indicates the size of the output of a fully connected layer.
- `activation` (default: `relu`): The default activation function that will be used for each layer. Options: `elu`, `leakyRelu`, `logSigmoid`, `relu`, `sigmoid`, `tanh`, `softmax`, `null`.
- `norm` (default: `null`): The default norm that will be used for each layer. Options: `batch`, `layer`, `null`. See Normalization for details.
- `use_bias` (default: `true`): Whether the layer uses a bias vector. Options: `true`, `false`.
- `bias_initializer` (default: `zeros`): Initializer to use for the bias vector. Options: `uniform`, `normal`, `constant`, `ones`, `zeros`, `eye`, `dirac`, `xavier_uniform`, `xavier_normal`, `kaiming_uniform`, `kaiming_normal`, `orthogonal`, `sparse`, `identity`.
- `weights_initializer` (default: `xavier_uniform`): Initializer to use for the weights matrix. Options: `uniform`, `normal`, `constant`, `ones`, `zeros`, `eye`, `dirac`, `xavier_uniform`, `xavier_normal`, `kaiming_uniform`, `kaiming_normal`, `orthogonal`, `sparse`, `identity`.
- `embeddings_on_cpu` (default: `false`): Whether to force the placement of the embedding matrix in regular memory and have the CPU resolve it. Options: `true`, `false`.
- `should_softmax` (default: `false`): Determines if the weights of the weighted sum should be passed through a softmax layer before being used. Options: `true`, `false`.
- `norm_params` (default: `null`): Parameters used if `norm` is either `batch` or `layer`.
- `num_fc_layers` (default: `0`): The number of stacked fully connected layers.
- `fc_layers` (default: `null`): List of dictionaries containing the parameters for each fully connected layer.
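For example, a sketch of an H3 input feature using this encoder with softmax-normalized weights:

name: h3_feature_name
type: h3
encoder:
    type: weighted_sum
    should_softmax: true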
RNN Encoder
This encoder encodes each component of the H3 representation (mode, edge, resolution, base cell and children cells) with embeddings. Children cells with value 0 will be masked out. After the embedding, all embeddings are passed through an RNN encoder.
The intuition behind this is that, starting from the base cell, the sequence of children cells can be seen as encoding the path in the tree of all H3 hexes.
encoder:
    type: rnn
    dropout: 0.0
    cell_type: rnn
    num_layers: 1
    embedding_size: 10
    recurrent_dropout: 0.0
    hidden_size: 10
    bias_initializer: zeros
    activation: tanh
    recurrent_activation: sigmoid
    unit_forget_bias: true
    weights_initializer: xavier_uniform
    recurrent_initializer: orthogonal
    reduce_output: last
    embeddings_on_cpu: false
    use_bias: true
    bidirectional: false
Parameters:

- `dropout` (default: `0.0`): The dropout rate.
- `cell_type` (default: `rnn`): The type of recurrent cell to use. Options: `rnn`, `lstm`, `lstm_block`, `ln`, `lstm_cudnn`, `gru`, `gru_block`, `gru_cudnn`. For reference about the differences between the cells, please refer to PyTorch's documentation. We suggest using the `block` variants on CPU and the `cudnn` variants on GPU because of their increased speed.
- `num_layers` (default: `1`): The number of stacked recurrent layers.
- `embedding_size` (default: `10`): The maximum embedding size adopted.
- `recurrent_dropout` (default: `0.0`): The dropout rate for the recurrent state.
- `hidden_size` (default: `10`): The size of the hidden representation within the recurrent cells. It is usually the same as `embedding_size`, but if the two values are different, a projection layer will be added before the first recurrent layer.
- `bias_initializer` (default: `zeros`): Initializer to use for the bias vector. Options: `uniform`, `normal`, `constant`, `ones`, `zeros`, `eye`, `dirac`, `xavier_uniform`, `xavier_normal`, `kaiming_uniform`, `kaiming_normal`, `orthogonal`, `sparse`, `identity`.
- `activation` (default: `tanh`): The activation function to use. Options: `elu`, `leakyRelu`, `logSigmoid`, `relu`, `sigmoid`, `tanh`, `softmax`, `null`.
- `recurrent_activation` (default: `sigmoid`): The activation function to use in the recurrent step. Options: `elu`, `leakyRelu`, `logSigmoid`, `relu`, `sigmoid`, `tanh`, `softmax`, `null`.
- `unit_forget_bias` (default: `true`): If true, add 1 to the bias of the forget gate at initialization. Options: `true`, `false`.
- `weights_initializer` (default: `xavier_uniform`): Initializer to use for the weights matrix. Options: `uniform`, `normal`, `constant`, `ones`, `zeros`, `eye`, `dirac`, `xavier_uniform`, `xavier_normal`, `kaiming_uniform`, `kaiming_normal`, `orthogonal`, `sparse`, `identity`.
- `recurrent_initializer` (default: `orthogonal`): The initializer for recurrent matrix weights. Options: `uniform`, `normal`, `constant`, `ones`, `zeros`, `eye`, `dirac`, `xavier_uniform`, `xavier_normal`, `kaiming_uniform`, `kaiming_normal`, `orthogonal`, `sparse`, `identity`.
- `reduce_output` (default: `last`): How to reduce the output tensor along the sequence length dimension if the rank of the tensor is greater than 2. Options: `last`, `sum`, `mean`, `avg`, `max`, `concat`, `attention`, `none`, `None`, `null`.
- `embeddings_on_cpu` (default: `false`): Whether to force the placement of the embedding matrix in regular memory and have the CPU resolve it. Options: `true`, `false`.
- `use_bias` (default: `true`): Whether to use a bias vector. Options: `true`, `false`.
- `bidirectional` (default: `false`): If true, two recurrent networks will perform encoding in the forward and backward direction and their outputs will be concatenated. Options: `true`, `false`.
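For example, a sketch of an H3 input feature using a bidirectional GRU encoder:

name: h3_feature_name
type: h3
encoder:
    type: rnn
    cell_type: gru
    bidirectional: true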
Output Features

There is currently no support for H3 as an output feature. Consider using the `text` type instead.
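For instance, a minimal sketch that treats an H3-encoded column as a text output target (the column name is illustrative):

output_features:
    - name: h3_column
      type: text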