Model Types

The top-level model_type parameter specifies the type of model to use.

The following model types are supported:

ecd (default): Encoder-Combiner-Decoder neural network model.
llm: Large Language Model for text generation.
gbm: Gradient Boosting Machine tree-based model.

model_type: ecd

Every model type has trainers associated with it. See the Trainer section for details about the supported training algorithms per model type.

Model Type: ECD

See the ECD documentation for details about the Encoder-Combiner-Decoder deep learning architecture.

Check

The full breadth of Ludwig functionality is available for the ecd model type.
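As an illustration, a minimal ecd configuration might look like the sketch below. The feature names (review, sentiment) are hypothetical placeholders, and the encoder and combiner choices are just one plausible combination, not a recommendation from this section.

```yaml
# Minimal ECD sketch: one text input, one category output.
# Feature names are hypothetical; substitute the column names of your dataset.
model_type: ecd
input_features:
  - name: review
    type: text
    encoder:
      type: parallel_cnn   # one of several available text encoders
combiner:
  type: concat             # the combiner section is honored for ecd
output_features:
  - name: sentiment
    type: category
```

Because ecd is the default, omitting the model_type line should yield the same model.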

Model Type: LLM

The LLM model type is a large language model for text generation. Large language models such as Llama-2 can also be used as text encoders in the ECD model type, but in that case the language model (LM) head is removed and the hidden states (embeddings) are used as the encoder output. The LLM model type, by contrast, retains the LM head, which is used for token generation.

The LLM model type supports all pretrained causal language models available on the HuggingFace Hub.

Attention

Selecting the llm model type introduces the following limitations:

only a single text input feature and a single text output feature are supported (for now)
the combiner section is ignored
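A configuration that respects these limitations could be sketched as follows. The feature names (prompt, response) are hypothetical, and the base_model key and the model identifier are assumptions not stated in this section; check them against the LLM configuration reference before use.

```yaml
# Minimal LLM sketch: exactly one text input and one text output.
# Feature names are hypothetical placeholders.
model_type: llm
base_model: meta-llama/Llama-2-7b-hf   # assumed key; any causal LM from the HuggingFace Hub
input_features:
  - name: prompt
    type: text
output_features:
  - name: response
    type: text
# No combiner section: it would be ignored for the llm model type.
```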

Model Type: GBM

The GBM model type is a gradient boosting machine (GBM): a tree-based model trained with a supported tree learner. Currently, the only supported tree learner is LightGBM.

Attention

Selecting the gbm model type introduces the following limitations:

only binary, category, and number features are supported
only a single output feature is supported
the combiner section is ignored
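A gbm configuration that stays within these limitations might look like the sketch below. The feature names are hypothetical and stand in for tabular columns of the three supported types.

```yaml
# Minimal GBM sketch: only binary, category, and number features,
# and exactly one output feature. Feature names are hypothetical.
model_type: gbm
input_features:
  - name: age
    type: number
  - name: occupation
    type: category
  - name: is_married
    type: binary
output_features:
  - name: income_over_50k
    type: binary
# No combiner section: it would be ignored for the gbm model type.
```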