Date Features
Date features are date or datetime strings such as 2023-06-25 15:00:00, 2023-06-25, 6-25-2023, or 6/25/2023.
Preprocessing¶
Ludwig will try to infer the date format automatically, but a specific format can be provided. The format string follows the same specification as Python's datetime module (the strftime/strptime format codes).
preprocessing:
    missing_value_strategy: fill_with_const
    fill_value: ''
    datetime_format: null

name: date_feature_name
type: date
preprocessing:
    missing_value_strategy: fill_with_const
    fill_value: ''
    datetime_format: "%d %b %Y"
Parameters:
missing_value_strategy (default: fill_with_const): What strategy to follow when there's a missing value in a date column. Options: fill_with_const, bfill, ffill, drop_row. See Missing Value Strategy for details.
fill_value (default: ''): The value to replace missing values with when missing_value_strategy is fill_with_const.
datetime_format (default: null): Either a datetime format string, or null, in which case the datetime format is inferred automatically.
Preprocessing parameters can also be defined once and applied to all date input features using the Type-Global Preprocessing section.
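As a rough illustration of what a datetime_format string means, the sketch below parses a date with Python's standard datetime.strptime, using the "%d %b %Y" format from the example configuration in this section. The helper name parse_date is hypothetical and is not part of Ludwig; how Ludwig applies the format internally is not shown here.

```python
from datetime import datetime

# Format string taken from the example configuration:
# day of month, abbreviated month name, four-digit year.
DATETIME_FORMAT = "%d %b %Y"


def parse_date(value: str, fmt: str = DATETIME_FORMAT) -> datetime:
    """Parse a date string with an explicit strptime-style format (hypothetical helper)."""
    return datetime.strptime(value, fmt)


parsed = parse_date("25 Jun 2023")
print(parsed.isoformat())
```

Note that %b matches locale-dependent month abbreviations, so "Jun" is only guaranteed to parse under an English or C locale.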
Input Features¶
Input date features are transformed into an int tensor of size N x 9 (where N is the size of the dataset and the 9 dimensions contain year, month, day, weekday, yearday, hour, minute, second, and second of day).
For example, the date 2022-06-25 09:30:59 would be deconstructed into:
[
    2022,   # Year
    6,      # June
    25,     # 25th day of the month
    5,      # Weekday: Saturday
    176,    # 176th day of the year
    9,      # Hour
    30,     # Minute
    59,     # Seconds
    34259,  # 34259th second of the day
]
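The same decomposition can be reproduced with Python's standard library. This is a minimal sketch rather than Ludwig's own preprocessing code; the function name date_to_vector is hypothetical, and it assumes the weekday is counted from Monday = 0, which is consistent with Saturday mapping to 5 in the example.

```python
from datetime import datetime
from typing import List


def date_to_vector(dt: datetime) -> List[int]:
    """Break a datetime into the nine integer components described in the text."""
    midnight = dt.replace(hour=0, minute=0, second=0, microsecond=0)
    return [
        dt.year,
        dt.month,
        dt.day,
        dt.weekday(),            # Monday is 0, so Saturday is 5
        dt.timetuple().tm_yday,  # day of the year, starting at 1
        dt.hour,
        dt.minute,
        dt.second,
        int((dt - midnight).total_seconds()),  # seconds elapsed since midnight
    ]


vector = date_to_vector(datetime(2022, 6, 25, 9, 30, 59))
print(vector)  # [2022, 6, 25, 5, 176, 9, 30, 59, 34259]
```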
The encoder parameters specified at the feature level are:
tied (default: null): Name of another input feature to tie the weights of the encoder with. It needs to be the name of a feature of the same type and with the same encoder parameters.
Currently there are two encoders supported for dates: DateEmbed (default) and DateWave. The encoder can be set by specifying embed or wave in the feature's encoder parameter in the input feature's configuration.
Example date feature entry in the input features list:
name: date_feature_name
type: date
encoder:
    type: embed
Encoder type and encoder parameters can also be defined once and applied to all date input features using the Type-Global Encoder section.
Encoders¶
Embed Encoder¶
This encoder passes the year through a fully connected layer of one neuron and embeds all other elements of the date, concatenates them, and passes the concatenated representation through fully connected layers.
encoder:
    type: embed
    dropout: 0.0
    embedding_size: 10
    output_size: 10
    activation: relu
    norm: null
    use_bias: true
    bias_initializer: zeros
    weights_initializer: xavier_uniform
    embeddings_on_cpu: false
    norm_params: null
    num_fc_layers: 0
    fc_layers: null
Parameters:
dropout (default: 0.0): Dropout probability for the embedding.
embedding_size (default: 10): The maximum embedding size adopted.
output_size (default: 10): If an output_size is not already specified in fc_layers, this is the default output_size that will be used for each layer. It indicates the size of the output of a fully connected layer.
activation (default: relu): The default activation function that will be used for each layer. Options: elu, leakyRelu, logSigmoid, relu, sigmoid, tanh, softmax, null.
norm (default: null): The default norm that will be used for each layer. Options: batch, layer, null. See Normalization for details.
use_bias (default: true): Whether the layer uses a bias vector. Options: true, false.
bias_initializer (default: zeros): Initializer to use for the bias vector. Options: uniform, normal, constant, ones, zeros, eye, dirac, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal, orthogonal, sparse, identity.
weights_initializer (default: xavier_uniform): Initializer to use for the weights matrix. Options: uniform, normal, constant, ones, zeros, eye, dirac, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal, orthogonal, sparse, identity.
embeddings_on_cpu (default: false): Whether to force the placement of the embedding matrix in regular memory and have the CPU resolve them. Options: true, false.
norm_params (default: null): Parameters used if norm is either batch or layer. See Normalization for details.
num_fc_layers (default: 0): The number of stacked fully connected layers.
fc_layers (default: null): List of dictionaries containing the parameters for each fully connected layer.
Wave Encoder¶
This encoder passes the year through a fully connected layer of one neuron and represents all other elements of the date by taking the cosine of their value with a different period (12 for months, 31 for days, etc.), concatenates them, and passes the concatenated representation through fully connected layers.
encoder:
    type: wave
    dropout: 0.0
    output_size: 10
    activation: relu
    norm: null
    use_bias: true
    bias_initializer: zeros
    weights_initializer: xavier_uniform
    norm_params: null
    num_fc_layers: 1
    fc_layers: null
Parameters:
dropout (default: 0.0): Dropout probability for the embedding.
output_size (default: 10): If an output_size is not already specified in fc_layers, this is the default output_size that will be used for each layer. It indicates the size of the output of a fully connected layer.
activation (default: relu): The default activation function that will be used for each layer. Options: elu, leakyRelu, logSigmoid, relu, sigmoid, tanh, softmax, null.
norm (default: null): The default norm that will be used for each layer. Options: batch, layer, null. See Normalization for details.
use_bias (default: true): Whether the layer uses a bias vector. Options: true, false.
bias_initializer (default: zeros): Initializer to use for the bias vector. Options: uniform, normal, constant, ones, zeros, eye, dirac, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal, orthogonal, sparse, identity.
weights_initializer (default: xavier_uniform): Initializer to use for the weights matrix. Options: uniform, normal, constant, ones, zeros, eye, dirac, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal, orthogonal, sparse, identity.
norm_params (default: null): Parameters used if norm is either batch or layer. See Normalization for details.
num_fc_layers (default: 1): The number of stacked fully connected layers.
fc_layers (default: null): List of dictionaries containing the parameters for each fully connected layer.
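The periodic representation used by the wave encoder can be sketched in plain Python. The formula cos(2π · value / period) is an assumption about what "taking the cosine of their value with a different period" means, not a copy of Ludwig's implementation. Only the periods 12 (months) and 31 (days) come from the text; the remaining periods and the names periodic and wave_features are illustrative.

```python
import math

# 12 and 31 are named in the text; the other periods are assumed for illustration.
PERIODS = {"month": 12, "day": 31, "weekday": 7, "hour": 24, "minute": 60, "second": 60}


def periodic(value: float, period: float) -> float:
    """Map a cyclic value onto a cosine wave with the given period."""
    return math.cos(2 * math.pi * value / period)


def wave_features(components: dict) -> dict:
    """Apply the periodic transform to each date component that has a known period."""
    return {name: periodic(components[name], PERIODS[name]) for name in PERIODS}


features = wave_features(
    {"month": 6, "day": 25, "weekday": 5, "hour": 9, "minute": 30, "second": 59}
)
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

Under this formula, values a whole period apart map to the same output, so the end of one cycle sits next to the start of the next (for example, month 12 and month 1 get similar values) instead of being treated as far apart.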
Output Features¶
There is currently no support for date as an output feature. Consider using the TEXT type.