Date Features
Date features are values such as 2023-06-25 15:00:00, 2023-06-25, 6-25-2023, or 6/25/2023.
Preprocessing¶
Ludwig will try to infer the date format automatically, but a specific format can be provided. The format string uses the same directives as Python's datetime module (the strftime/strptime format codes).
preprocessing:
    missing_value_strategy: fill_with_const
    datetime_format: null
    fill_value: ''
name: date_feature_name
type: date
preprocessing:
    missing_value_strategy: fill_with_const
    fill_value: ''
    datetime_format: "%d %b %Y"
Parameters:
- missing_value_strategy (default: fill_with_const): What strategy to follow when there's a missing value in a date column. Options: fill_with_const, bfill, ffill, drop_row. See Missing Value Strategy for details.
- datetime_format (default: null): This parameter can either be a datetime format string, or null, in which case the datetime format will be inferred automatically.
- fill_value (default: ''): The value to replace missing values with in case the missing_value_strategy is fill_with_const.
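The four missing_value_strategy options can be illustrated on a plain Python list. This is only a sketch of what the option names conventionally mean (constant fill, backward fill, forward fill, row removal), not Ludwig's implementation; the helper apply_missing_value_strategy and the sample column are hypothetical.

```python
from typing import List, Optional


def apply_missing_value_strategy(
    values: List[Optional[str]], strategy: str, fill_value: str = ""
) -> List[Optional[str]]:
    """Illustrate each strategy on a list where None marks a missing value."""
    if strategy == "fill_with_const":
        # Replace every missing entry with the configured constant.
        return [fill_value if v is None else v for v in values]
    if strategy == "drop_row":
        # Remove rows whose value is missing.
        return [v for v in values if v is not None]
    if strategy == "ffill":
        # Forward fill: carry the last seen value forward.
        # Leading missing values have nothing to copy and stay None.
        filled, last = [], None
        for v in values:
            last = v if v is not None else last
            filled.append(last)
        return filled
    if strategy == "bfill":
        # Backward fill: forward fill on the reversed list, then reverse back.
        return apply_missing_value_strategy(values[::-1], "ffill")[::-1]
    raise ValueError(f"unknown strategy: {strategy}")


column = ["2023-06-25", None, "2023-06-27"]  # hypothetical date column
for name in ("fill_with_const", "bfill", "ffill", "drop_row"):
    print(name, apply_missing_value_strategy(column, name))
```

With bfill the gap takes the following value (2023-06-27), while with ffill it takes the preceding one (2023-06-25).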
Preprocessing parameters can also be defined once and applied to all date input features using the Type-Global Preprocessing section.
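Because datetime_format follows Python's datetime format codes, a format string can be checked against a sample value before it goes into the config. The snippet below is a minimal sketch using only the standard library; the sample strings are hypothetical, and %b assumes English month abbreviations (the default C locale).

```python
from datetime import datetime

datetime_format = "%d %b %Y"  # same format string as in the example config
sample = "25 Jun 2023"        # hypothetical value from a date column

# A matching string parses into a datetime object.
parsed = datetime.strptime(sample, datetime_format)
print(parsed.isoformat())

# A string in a different layout raises ValueError.
try:
    datetime.strptime("2023-06-25", datetime_format)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(mismatch_detected)
```

If the second check does not raise for values that are actually present in the column, the format string is probably too permissive or the data mixes several layouts.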
Input Features¶
Input date features are transformed into int tensors of size N x 9 (where N is the size of the dataset and the 9 dimensions contain year, month, day, weekday, yearday, hour, minute, second, and second of day).
For example, the date 2022-06-25 09:30:59 would be deconstructed into:
[
    2022,   # Year
    6,      # June
    25,     # 25th day of the month
    5,      # Weekday: Saturday
    176,    # 176th day of the year
    9,      # Hour
    30,     # Minute
    59,     # Seconds
    34259,  # 34259th second of the day
]
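The decomposition above can be reproduced with Python's standard library. This is a minimal sketch rather than Ludwig's own code; the function name date_to_vector is hypothetical, and the weekday convention (Monday is 0) is inferred from the example, where Saturday maps to 5.

```python
from datetime import datetime
from typing import List


def date_to_vector(dt: datetime) -> List[int]:
    """Split a datetime into the 9 integer components listed above."""
    second_of_day = dt.hour * 3600 + dt.minute * 60 + dt.second
    return [
        dt.year,
        dt.month,
        dt.day,
        dt.weekday(),            # Monday is 0, so Saturday is 5
        dt.timetuple().tm_yday,  # day of the year, starting at 1
        dt.hour,
        dt.minute,
        dt.second,
        second_of_day,
    ]


vector = date_to_vector(datetime(2022, 6, 25, 9, 30, 59))
print(vector)  # [2022, 6, 25, 5, 176, 9, 30, 59, 34259]
```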
The encoder parameters specified at the feature level are:
- tied (default: null): Name of another input feature to tie the weights of the encoder with. It needs to be the name of a feature of the same type and with the same encoder parameters.
Currently there are two encoders supported for dates: DateEmbed (default) and DateWave. The encoder can be set by specifying embed or wave in the feature's encoder parameter in the input feature's configuration.
Example date feature entry in the input features list:
name: date_feature_name
type: date
encoder:
    type: embed
Encoder type and encoder parameters can also be defined once and applied to all date input features using the Type-Global Encoder section.
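As an illustration of the tied parameter, the sketch below builds a configuration as a Python dictionary with two date input features that share encoder weights. The column names order_date, ship_date, and late, as well as the binary output feature, are hypothetical and only there to make the config complete.

```python
import json

config = {
    "input_features": [
        # First date feature: owns the encoder weights.
        {"name": "order_date", "type": "date", "encoder": {"type": "embed"}},
        # Second date feature: same type and encoder parameters,
        # with weights tied to the first feature by name.
        {
            "name": "ship_date",
            "type": "date",
            "encoder": {"type": "embed"},
            "tied": "order_date",
        },
    ],
    # Hypothetical output feature, needed for a complete config.
    "output_features": [{"name": "late", "type": "binary"}],
}

print(json.dumps(config, indent=2))
```

Note that tied sits at the feature level, next to encoder, rather than inside the encoder block.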
Encoders¶
Embed Encoder¶
This encoder passes the year through a fully connected layer of one neuron and embeds all other elements of the date. It then concatenates them and passes the concatenated representation through fully connected layers.
encoder:
    type: embed
    dropout: 0.0
    embedding_size: 10
    output_size: 10
    activation: relu
    norm: null
    use_bias: true
    bias_initializer: zeros
    weights_initializer: xavier_uniform
    embeddings_on_cpu: false
    norm_params: null
    num_fc_layers: 0
    fc_layers: null
Parameters:
- dropout (default: 0.0): Dropout probability for the embedding.
- embedding_size (default: 10): The maximum embedding size adopted.
- output_size (default: 10): If an output_size is not already specified in fc_layers this is the default output_size that will be used for each layer. It indicates the size of the output of a fully connected layer.
- activation (default: relu): The default activation function that will be used for each layer. Options: elu, leakyRelu, logSigmoid, relu, sigmoid, tanh, softmax, null.
- norm (default: null): The default norm that will be used for each layer. Options: batch, layer, null. See Normalization for details.
- use_bias (default: true): Whether the layer uses a bias vector. Options: true, false.
- bias_initializer (default: zeros): Initializer to use for the bias vector. Options: uniform, normal, constant, ones, zeros, eye, dirac, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal, orthogonal, sparse, identity.
- weights_initializer (default: xavier_uniform): Initializer to use for the weights matrix. Options: uniform, normal, constant, ones, zeros, eye, dirac, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal, orthogonal, sparse, identity.
- embeddings_on_cpu (default: false): Whether to force the placement of the embedding matrix in regular memory and have the CPU resolve them. Options: true, false.
- norm_params (default: null): Parameters used if norm is either batch or layer. See Normalization for details.
- num_fc_layers (default: 0): The number of stacked fully connected layers.
- fc_layers (default: null): List of dictionaries containing the parameters for each fully connected layer.
Wave Encoder¶
This encoder passes the year through a fully connected layer of one neuron and represents all other elements of the date by taking the cosine of their value with a different period (12 for months, 31 for days, etc.). It then concatenates them and passes the concatenated representation through fully connected layers.
encoder:
    type: wave
    dropout: 0.0
    output_size: 10
    activation: relu
    norm: null
    use_bias: true
    bias_initializer: zeros
    weights_initializer: xavier_uniform
    norm_params: null
    num_fc_layers: 1
    fc_layers: null
Parameters:
- dropout (default: 0.0): Dropout probability for the embedding.
- output_size (default: 10): If an output_size is not already specified in fc_layers this is the default output_size that will be used for each layer. It indicates the size of the output of a fully connected layer.
- activation (default: relu): The default activation function that will be used for each layer. Options: elu, leakyRelu, logSigmoid, relu, sigmoid, tanh, softmax, null.
- norm (default: null): The default norm that will be used for each layer. Options: batch, layer, null. See Normalization for details.
- use_bias (default: true): Whether the layer uses a bias vector. Options: true, false.
- bias_initializer (default: zeros): Initializer to use for the bias vector. Options: uniform, normal, constant, ones, zeros, eye, dirac, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal, orthogonal, sparse, identity.
- weights_initializer (default: xavier_uniform): Initializer to use for the weights matrix. Options: uniform, normal, constant, ones, zeros, eye, dirac, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal, orthogonal, sparse, identity.
- norm_params (default: null): Parameters used if norm is either batch or layer. See Normalization for details.
- num_fc_layers (default: 1): The number of stacked fully connected layers.
- fc_layers (default: null): List of dictionaries containing the parameters for each fully connected layer.
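The periodic representation used by the wave encoder can be sketched in plain Python. The periods for month (12) and day (31) come from the description above; the remaining periods and the exact scaling cos(2π · value / period) are assumptions made for illustration, and the names PERIODS, periodic, and wave_features are hypothetical. The year, which the encoder handles with a fully connected layer, is left out.

```python
import math
from typing import Dict

PERIODS: Dict[str, int] = {
    "month": 12,             # stated in the encoder description
    "day": 31,               # stated in the encoder description
    "weekday": 7,            # assumed
    "yearday": 366,          # assumed
    "hour": 24,              # assumed
    "minute": 60,            # assumed
    "second": 60,            # assumed
    "second_of_day": 86400,  # assumed
}


def periodic(value: float, period: float) -> float:
    """Map a value onto a cosine wave that repeats every `period` units."""
    return math.cos(2 * math.pi * value / period)


def wave_features(components: Dict[str, int]) -> Dict[str, float]:
    """Apply the periodic transform to every non-year date component."""
    return {name: periodic(components[name], PERIODS[name]) for name in PERIODS}


# Components of 2022-06-25 09:30:59, as in the Input Features example.
components = {
    "month": 6, "day": 25, "weekday": 5, "yearday": 176,
    "hour": 9, "minute": 30, "second": 59, "second_of_day": 34259,
}
features = wave_features(components)
for name, value in features.items():
    print(f"{name}: {value:+.3f}")
```

Every output lies in [-1, 1], and values one full period apart map to the same number, which is what lets the encoder treat minute 59 and minute 0 as neighbors.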
Output Features¶
There is currently no support for date as an output feature. Consider using the TEXT type.
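One way to follow this suggestion is to render dates as strings in a single fixed format before training, so that the column can be declared as a text output feature. The snippet below is a sketch under that assumption; the sample timestamps and the chosen format are illustrative only.

```python
from datetime import datetime

# Hypothetical target values to be predicted as text.
timestamps = [
    datetime(2022, 6, 25, 9, 30, 59),
    datetime(2023, 6, 25, 15, 0, 0),
]

# Use one consistent, zero-padded format so every target has the same layout.
text_targets = [ts.strftime("%Y-%m-%d %H:%M:%S") for ts in timestamps]
print(text_targets)  # ['2022-06-25 09:30:59', '2023-06-25 15:00:00']
```

Predicted strings can then be parsed back with datetime.strptime and the same format, keeping in mind that a text model may produce strings that do not parse as valid dates.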