Time Series Features
Preprocessing¶
Timeseries features are handled as sequence features, with the only difference being that the matrix in the HDF5 preprocessing file uses floats instead of integers.
Since the data is continuous, the JSON file that typically stores vocabulary mappings isn't needed.
Ludwig supports two data formats for timeseries:
- Row-major (default): each row in the dataset is already a space-separated sequence of floats representing one complete window. Use `timeseries_length_limit` to cap the window size.
- Column-major: each row is a single scalar observation. Ludwig converts to row-major automatically using a sliding window controlled by `preprocessing.window_size` (for inputs) or `preprocessing.horizon` (for outputs).
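To make the two layouts concrete, here is a small pandas sketch; the `temperature` column name and the values are illustrative only:

```python
import pandas as pd

# Row-major: each cell is already a complete space-separated window.
row_major = pd.DataFrame({
    "temperature": ["20.1 20.3 20.2", "20.3 20.2 20.5", "20.2 20.5 20.7"],
})

# Column-major: each row is a single scalar observation; Ludwig builds
# the windows itself when preprocessing.window_size is set.
column_major = pd.DataFrame({
    "temperature": [20.1, 20.3, 20.2, 20.5, 20.7],
})

# Each row-major cell tokenizes (with the default space tokenizer)
# into a fixed-length window of floats.
window_lengths = row_major["temperature"].map(lambda s: len(s.split())).tolist()
```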
Input Features¶
Preprocessing¶
```yaml
missing_value_strategy: fill_with_const
tokenizer: space
timeseries_length_limit: 256
padding_value: 0.0
padding: right
fill_value: ''
computed_fill_value: ''
window_size: 0
```
Parameters:
- `missing_value_strategy` (default: `fill_with_const`): What strategy to follow when a row of data is missing. Options: `fill_with_const`, `fill_with_mode`, `bfill`, `ffill`, `drop_row`.
- `tokenizer` (default: `space`): How the timeseries string is split into a sequence of float values.
- `timeseries_length_limit` (default: `256`): Maximum length of the timeseries; longer sequences are truncated.
- `padding_value` (default: `0.0`): Float value used for padding.
- `padding` (default: `right`): Side on which sequences are padded. Options: `left`, `right`.
- `fill_value` (default: `''`): The value to replace missing values with when `missing_value_strategy` is `fill_with_const`.
- `computed_fill_value` (default: `''`): The computed fill value, which overrides `fill_value` if set.
- `window_size` (default: `0`): Optional lookback window size used to convert a column-major dataset (one observation per row) into a row-major dataset (each row holds a timeseries window of observations). Starting from a given observation, a sliding window is taken going `window_size - 1` rows back to form the timeseries input feature. If left as `0`, the dataset is assumed to already be in row-major format (i.e., each row is a timeseries window).
Column-major preprocessing with window_size¶
When your dataset has one observation per row (column-major), set `window_size` to the number of past observations each input window should span:
```yaml
input_features:
  - name: temperature
    type: timeseries
    preprocessing:
      window_size: 24  # use the last 24 observations as context
      padding_value: 0.0
```
Ludwig will slide a window of length `window_size` over the column and produce one row-major window per observation, padding the beginning of the series with `padding_value`.
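The conversion can be sketched in plain NumPy. This mirrors the behavior described above (left-padding with `padding_value`); it is a sketch, not Ludwig's actual implementation:

```python
import numpy as np

def column_major_to_row_major(values, window_size, padding_value=0.0):
    """Turn a 1-D series (one observation per row) into row-major windows:
    each output row holds the current observation plus the window_size - 1
    observations before it, padding the start of the series."""
    pad = np.full(window_size - 1, padding_value)
    padded = np.concatenate([pad, values])
    return np.stack([padded[i:i + window_size] for i in range(len(values))])

windows = column_major_to_row_major(np.array([1.0, 2.0, 3.0, 4.0]), window_size=3)
# windows[0] is [0.0, 0.0, 1.0]; windows[3] is [2.0, 3.0, 4.0]
```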
Encoders¶
Sequence Encoders¶
Time series encoders are the same as for Sequence Features, with one exception:
Time series features don't have an embedding layer at the beginning, so the `b x s` placeholders (where `b` is the batch size and `s` is the sequence length) are directly mapped to a `b x s x 1` tensor and then passed to the different sequential encoders.
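In NumPy terms, that mapping is just the addition of a trailing channel dimension (shapes are illustrative):

```python
import numpy as np

batch_size, seq_len = 4, 16  # b and s
# A batch of raw timeseries windows: shape (b, s), already floats,
# so no embedding lookup is needed.
x = np.random.randn(batch_size, seq_len).astype(np.float32)

# Add a trailing dimension to get the (b, s, 1) tensor that the
# sequential encoders consume.
x_for_encoder = x[:, :, np.newaxis]
```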
The encoder parameters specified at the feature level are:
`tied` (default: `null`): name of another input feature to tie the weights of the encoder with. It needs to be the name of a feature of the same type and with the same encoder parameters.
Example timeseries input feature:
```yaml
name: timeseries_column_name
type: timeseries
tied: null
encoder:
  type: parallel_cnn
```
Passthrough Encoder¶
```mermaid
graph LR
  A["12\n7\n43\n65\n23\n4\n1"] --> B["Cast float32"];
  B --> C["Aggregation\n Reduce\n Operation"];
  C --> ...;
```
The passthrough encoder simply casts each input value to a float and adds a dimension to the input tensor, creating a `b x s x 1` tensor where `b` is the batch size and `s` is the length of the sequence.
The tensor is reduced along the `s` dimension to obtain a single vector of size `h` for each element of the batch.
If you want to output the full `b x s x h` tensor, you can specify `reduce_output: null`.
This is useful for timeseries features when you want to pass the raw window directly to a downstream combiner such as the `sequence` combiner.
```yaml
encoder:
  type: passthrough
  encoding_size: null
  reduce_output: null
  skip: false
  adapter: null
```
Parameters:
- `encoding_size` (default: `null`): The size of the encoding vector, or `null` if sequence elements are scalars.
- `reduce_output` (default: `null`): How to reduce the output tensor along the `s` sequence length dimension if the rank of the tensor is greater than 2. Options: `last`, `sum`, `mean`, `avg`, `max`, `concat`, `attention`, `attention_pooling`, `none`, `None`, `null`.
- `skip` (default: `false`): Whether to skip the encoder entirely and pass the input through unchanged.
- `adapter` (default: `null`): Optional adapter configuration for the encoder.
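A minimal NumPy sketch of the passthrough behavior for scalar sequence elements (not Ludwig's actual code):

```python
import numpy as np

def passthrough(x, reduce_output="mean"):
    """Cast to float, add a trailing dimension to get (b, s, 1),
    then optionally reduce along the sequence dimension s."""
    t = x.astype(np.float32)[:, :, np.newaxis]  # (b, s, 1)
    if reduce_output is None:
        return t                 # full (b, s, 1) tensor for the combiner
    if reduce_output == "mean":
        return t.mean(axis=1)    # (b, 1)
    if reduce_output == "last":
        return t[:, -1, :]       # (b, 1)
    raise ValueError(f"unsupported reduce_output: {reduce_output}")

x = np.arange(6, dtype=np.float32).reshape(2, 3)  # batch of 2, length 3
```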
Output Features¶
Ludwig supports timeseries as an output feature for forecasting tasks. The decoder projects the combined representation to a vector of length `horizon` — one predicted value per future timestep. All steps are predicted simultaneously in a single forward pass (direct multi-step forecasting).
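Conceptually, the decoder is a single linear projection; the following NumPy sketch uses illustrative shapes only:

```python
import numpy as np

rng = np.random.default_rng(0)
combiner_size, horizon, batch_size = 32, 12, 4

# Direct multi-step forecasting: one linear map from the combined
# representation to `horizon` values, all predicted at once.
W = rng.standard_normal((combiner_size, horizon)).astype(np.float32)
b = np.zeros(horizon, dtype=np.float32)

combined = rng.standard_normal((batch_size, combiner_size)).astype(np.float32)
forecast = combined @ W + b  # shape (batch_size, horizon)
```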
Preprocessing¶
```yaml
missing_value_strategy: drop_row
tokenizer: space
timeseries_length_limit: 256
padding_value: 0.0
padding: right
fill_value: ''
computed_fill_value: ''
horizon: 0
```
Parameters:
- `missing_value_strategy` (default: `drop_row`): What strategy to follow when a row of data is missing. Options: `fill_with_const`, `fill_with_mode`, `bfill`, `ffill`, `drop_row`.
- `tokenizer` (default: `space`): How the timeseries string is split into a sequence of float values.
- `timeseries_length_limit` (default: `256`): Maximum length of the timeseries; longer sequences are truncated.
- `padding_value` (default: `0.0`): Float value used for padding.
- `padding` (default: `right`): Side on which sequences are padded. Options: `left`, `right`.
- `fill_value` (default: `''`): The value to replace missing values with when `missing_value_strategy` is `fill_with_const`.
- `computed_fill_value` (default: `''`): The computed fill value, which overrides `fill_value` if set.
- `horizon` (default: `0`): Optional forecasting horizon used to convert a column-major dataset (one observation per row) into a row-major dataset (each row holds a timeseries window of observations). Starting from a given observation, a sliding window is taken going `horizon` rows forward in time, excluding the observation in the current row. If left as `0`, the dataset is assumed to already be in row-major format (i.e., each row is a timeseries window).
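The forward-looking window can be sketched the same way as the input conversion. Note that dropping the trailing rows that lack a complete future window is an assumption of this sketch, not necessarily what Ludwig does:

```python
import numpy as np

def build_horizon_targets(values, horizon):
    """For each row, the target is the next `horizon` observations,
    excluding the current one. Trailing rows without a complete
    future window are dropped (an assumption of this sketch)."""
    n = len(values) - horizon
    return np.stack([values[i + 1:i + 1 + horizon] for i in range(n)])

targets = build_horizon_targets(np.array([1.0, 2.0, 3.0, 4.0, 5.0]), horizon=2)
# targets[0] is [2.0, 3.0]; the last row's target is [4.0, 5.0]
```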
Column-major preprocessing with horizon¶
When using column-major data, set `horizon` on the output feature to tell Ludwig how many steps ahead each training target spans:
```yaml
output_features:
  - name: temperature
    type: timeseries
    preprocessing:
      horizon: 12  # predict the next 12 observations
    decoder:
      type: projector
```
The input feature must share the same column name (Ludwig uses it to align input windows with output targets):
```yaml
input_features:
  - name: temperature
    type: timeseries
    preprocessing:
      window_size: 24

output_features:
  - name: temperature
    type: timeseries
    preprocessing:
      horizon: 12
    decoder:
      type: projector
```
Decoders¶
Projector¶
The `projector` decoder is a fully connected layer (or a stack of FC layers) that maps the combiner output to a vector of size `horizon`. This is the recommended decoder for timeseries outputs.
```yaml
output_features:
  - name: temperature
    type: timeseries
    decoder:
      type: projector
```
Loss¶
The default loss for timeseries output features is Huber loss, which is more robust to outliers than MSE. You can override it:
```yaml
output_features:
  - name: temperature
    type: timeseries
    loss:
      type: mean_squared_error
```
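For intuition, here is a generic comparison of the two losses (a standalone sketch, not Ludwig's internals):

```python
import numpy as np

def huber(error, delta=1.0):
    """Huber loss: quadratic for |error| <= delta, linear beyond it,
    so outlier errors grow linearly instead of quadratically."""
    abs_e = np.abs(error)
    return np.where(abs_e <= delta,
                    0.5 * error ** 2,
                    delta * (abs_e - 0.5 * delta))

errors = np.array([0.5, 1.0, 10.0])
quadratic = 0.5 * errors ** 2  # the MSE-style term, for comparison
# For the outlier error of 10.0, Huber gives 9.5 vs 50.0 for the
# quadratic term, which is why Huber is less sensitive to outliers.
```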
Forecasting with model.forecast()¶
After training a model with timeseries input and output features, use `model.forecast()` to generate multi-step predictions from a seed dataset:
```python
import pandas as pd
from ludwig.api import LudwigModel

model = LudwigModel.load("results/experiment_run/model")

# Seed data — must contain enough rows to fill the input window_size.
# Only the last window_size rows are used as context.
seed_df = pd.read_csv("recent_observations.csv")

# Predict 48 steps ahead, iteratively sliding the window.
forecast_df = model.forecast(seed_df, horizon=48)
print(forecast_df)
# Returns a DataFrame with one column per timeseries output feature,
# and one row per forecasted timestep.
```
`model.forecast()` uses an efficient incremental strategy: it preprocesses the initial window once, then slides each new prediction into the window in O(1) per step rather than re-running full preprocessing.
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `dataset` | DataFrame / path | required | Seed data containing at least `window_size` rows |
| `horizon` | int | `1` | Number of timesteps to forecast ahead |
| `data_format` | str | `"auto"` | Dataset format (csv, parquet, etc.) |
| `output_directory` | str | `None` | If set, saves forecast results here |
| `output_format` | str | `"parquet"` | Format for saved results |
Note
`model.forecast()` requires the model to have at least one timeseries input feature and at least one timeseries output feature. If the output feature column name matches the input feature column name, Ludwig automatically feeds each predicted value back as input for the next step.
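The feedback loop can be sketched generically; `predict_one` below stands in for the trained model, and the whole function is an illustration, not Ludwig's internal code:

```python
def iterative_forecast(seed, window_size, horizon, predict_one):
    """Keep the last `window_size` values as context, predict one step,
    slide the prediction into the window, and repeat `horizon` times."""
    window = list(seed[-window_size:])
    forecasts = []
    for _ in range(horizon):
        y = predict_one(window)
        forecasts.append(y)
        window = window[1:] + [y]  # drop oldest, append newest prediction
    return forecasts

# Toy "model" that predicts the mean of the current window.
preds = iterative_forecast([1.0, 2.0, 3.0, 4.0], window_size=3,
                           horizon=2, predict_one=lambda w: sum(w) / len(w))
# First prediction is mean(2, 3, 4) = 3.0
```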