Ludwig is the open-source declarative deep learning framework for building, fine-tuning, and deploying custom models — across tabular data, text, images, and audio — without writing training loops.
Most of the effort in an ML project goes into infrastructure: data loaders, training loops, distributed setup, export pipelines. Ludwig inverts this. You describe your model in YAML. Ludwig builds and runs it.
You get the simplicity of AutoML with the flexibility of raw PyTorch. Expert-level control is always one config field away — but you never have to touch it if you don't need to.
# describe your model in YAML
input_features:
  - name: review_text
    type: text
    encoder:
      type: bert
output_features:
  - name: sentiment
    type: category
trainer:
  epochs: 5
  learning_rate: 2.0e-5
$ ludwig train --config model.yaml \
--dataset reviews.csv
✓ Preprocessing ── 12,500 rows
✓ Training ──────── epoch 5/5
✓ Test accuracy ─── 94.2%
✓ Saved ─────────── results/
From prototype to production — Ludwig handles the full ML lifecycle without requiring you to write infrastructure code.
Define your entire ML pipeline — preprocessing, encoders, architecture, training, HPO — in a single validated YAML file. No boilerplate training loops.
Mix tabular features, raw text, images, audio, and time series in a single model. Train multiple outputs simultaneously. No other framework makes this as seamless.
SFT, DPO, KTO, ORPO, and GRPO in one framework. LoRA, QLoRA, DoRA, VeRA with 4-bit quantization. Merge multi-adapter models with TIES, DARE, or SVD.
Add a backend: ray line and your local job becomes a distributed Ray cluster job. Zero rewrites, zero surprises.
Built-in HPO with Ray Tune and Optuna. Auto, TPE, GP, and CMA-ES samplers. SQLite or PostgreSQL persistence. No external orchestration needed.
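As an illustration, hyperparameter search is declared in the same config file as the model. A minimal sketch for the review-sentiment model above; exact keys and available samplers can vary by Ludwig version:

hyperopt:
  goal: maximize
  metric: accuracy
  output_feature: sentiment
  parameters:
    trainer.learning_rate:
      space: loguniform
      lower: 1.0e-5
      upper: 1.0e-3
  executor:
    type: ray
    num_samples: 20

$ ludwig hyperopt --config model.yaml --dataset reviews.csv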
One command to serve your model as a REST API. Export to SafeTensors, ONNX, or torch.export. Prebuilt Docker images for CPU, GPU, Ray. Upload to HuggingFace Hub.
First-class integrations with W&B, MLflow, TensorBoard, Comet ML, and Aim. Auto-generated training reports and visualizations out of the box.
Plug in custom encoders, decoders, combiners, loss functions, and metrics. Use any HuggingFace model as a backbone. Ludwig is a framework, not a cage.
One-line auto_train() finds a strong baseline automatically. Built-in feature importance, model explainability, and rich visualizations help you understand what the model learned.
Ludwig handles data preprocessing for every feature type automatically — no custom pipelines required.
BERT, GPT-2, LLaMA, any HuggingFace model as encoder
Numeric, categorical, binary, set, bag — all handled
ResNet, EfficientNet, DINOv2, ViT — classification & segmentation
Wav2Vec2, raw audio encoders, speech recognition
Forecasting with efficient O(window) preprocessing
H3 hexagonal indexing, location features
Dense embedding vectors and learned representations
Automatic cyclic encoding of temporal features
Token sequences, genomic data, protein chains, discrete symbols
Deep SVDD, Deep SAD, DROCC — unsupervised and semi-supervised anomaly scoring
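Every type in the list above is declared the same way as the text and category features shown earlier. A minimal sketch with hypothetical column names:

input_features:
  - name: pickup_cell
    type: h3        # H3 hexagonal index
  - name: signup_date
    type: date      # cyclic temporal encoding is automatic
  - name: profile_embedding
    type: vector
  - name: dna_segment
    type: sequence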
Ludwig gives you the complete alignment stack — from supervised instruction tuning to reward-model-free RL — in a single YAML file. Run Llama, Mistral, Qwen, or any HuggingFace model with 4-bit QLoRA on a single consumer GPU.
New in v0.15: GRPO (Group Relative Policy Optimization) for reward-model-free RLHF. Train policy models directly from rule-based or custom reward signals.
model_type: llm
base_model: meta-llama/Llama-3.1-8B
quantization:
  bits: 4
adapter:
  type: lora
  r: 16
  alpha: 32
input_features:
  - name: instruction
    type: text
output_features:
  - name: response
    type: text
trainer:
  type: finetune
  epochs: 3
  learning_rate: 1.0e-4
  batch_size: 1
  gradient_accumulation_steps: 16
$ ludwig train --config llm_finetune.yaml \
--dataset "ludwig://alpaca"
$ ludwig serve --model_path results/
input_features:
  - name: text
    type: text
output_features:
  - name: label
    type: category
trainer:
  epochs: 10
# ↓ add this to go distributed
backend:
  type: ray
  trainer:
    use_gpu: true
    num_workers: 8
    strategy: fsdp
Ludwig's backend config key transparently handles data sharding, gradient sync, and checkpointing — without changing your model definition.
Train locally with CPU or GPU. Fast iteration, no setup.
One config change. DDP, FSDP, or DeepSpeed — your call.
Native KubeRay integration for production cluster scheduling.
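For an existing Ray or KubeRay cluster, one way to hand over the same training command is the Ray Jobs API. A sketch, assuming the cluster dashboard address is reachable and the config and dataset live in the working directory:

$ ray job submit --address http://<head-node>:8265 --working-dir . \
    -- ludwig train --config model.yaml --dataset data.csv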
Install Ludwig, pick a use case, run one command. That's it.
input_features:
  - name: review_content
    type: text
    encoder:
      type: bert   # or: embed, roberta, deberta, xlnet…
output_features:
  - name: recommended
    type: binary
trainer:
  epochs: 5
  learning_rate: 2.0e-5
─────────────────────────────────────────────
$ pip install ludwig[text]
$ ludwig train --config text_classifier.yaml --dataset rotten_tomatoes.csv
$ ludwig predict --model_path results/ --dataset new_reviews.csv
$ ludwig serve --model_path results/
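ludwig serve starts a FastAPI server whose /predict endpoint accepts one form field per input feature. A sketch, assuming the default host and port:

$ curl http://0.0.0.0:8000/predict -X POST \
    -F "review_content=One of the sharpest comedies of the year."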
model_type: llm
base_model: meta-llama/Llama-3.1-8B-Instruct
quantization:
  bits: 4   # 4-bit QLoRA — runs on a single consumer GPU
adapter:
  type: lora
  r: 16
  alpha: 32
prompt:
  template: |
    ### Instruction:
    {instruction}
    ### Response:
input_features:
  - name: instruction
    type: text
output_features:
  - name: response
    type: text
trainer:
  type: finetune
  epochs: 3
  learning_rate: 1.0e-4
  batch_size: 1
  gradient_accumulation_steps: 16
─────────────────────────────────────────────
$ pip install ludwig[llm]
$ ludwig train --config llm_sft.yaml --dataset "ludwig://alpaca"
$ ludwig upload_to_hf_hub --model_path results/ --repo my-org/my-model
model_type: llm
base_model: Qwen/Qwen2.5-7B-Instruct
adapter:
  type: lora
quantization:
  bits: 4
input_features:
  - name: prompt
    type: text
output_features:
  - name: response
    type: text
trainer:
  type: grpo            # Group Relative Policy Optimization
  reward_fn: accuracy   # custom or built-in reward signal
  num_generations: 8
  epochs: 1
  learning_rate: 5.0e-6
─────────────────────────────────────────────
$ pip install ludwig[llm]
$ ludwig train --config grpo.yaml --dataset math_problems.csv
# mix any modalities — Ludwig handles the rest
input_features:
  - name: product_image
    type: image
    encoder:
      type: dinov2
  - name: description
    type: text
    encoder:
      type: bert
  - name: price
    type: number
  - name: brand
    type: category
output_features:
  - name: category
    type: category
  - name: is_premium
    type: binary   # multi-task: two outputs at once
trainer:
  epochs: 10
─────────────────────────────────────────────
$ pip install "ludwig[text,vision]"
$ ludwig train --config multimodal.yaml --dataset products.csv
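By default Ludwig concatenates the encoded features before the output decoders; the combiner can also be chosen explicitly. A minimal addition to the config above:

combiner:
  type: transformer   # or: concat (default), tabnet, comparator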
input_features:
  - name: image_path
    type: image
    encoder:
      type: efficientnet   # or: resnet, vit, dinov2, stacked_cnn
      model_variant: b4
      use_pretrained: true
      trainable: true
output_features:
  - name: label
    type: category
trainer:
  epochs: 20
  learning_rate: 1.0e-4
  learning_rate_scheduler:
    decay: cosine
─────────────────────────────────────────────
$ pip install "ludwig[vision]"
$ ludwig train --config image_classifier.yaml --dataset images.csv
$ ludwig visualize --visualization confusion_matrix --ground_truth test.csv
input_features:
  - name: transaction_amount
    type: number
  - name: merchant_category
    type: category
  - name: hour_of_day
    type: number
  - name: country
    type: category
  - name: is_new_device
    type: binary
output_features:
  - name: is_fraud
    type: binary
trainer:
  epochs: 30
  class_weights: balanced   # handles class imbalance
─────────────────────────────────────────────
$ pip install ludwig
$ ludwig train --config fraud_detection.yaml --dataset transactions.csv
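Because fraud labels are heavily imbalanced, raw accuracy alone is not very informative; Ludwig's evaluate command reports the model's full evaluation metrics on a held-out set. A sketch, where test.csv is a hypothetical held-out split:

$ ludwig evaluate --model_path results/ --dataset test.csv --output_directory eval/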
input_features:
  - name: sales_history
    type: timeseries
    encoder:
      type: transformer
    preprocessing:
      window_size: 90   # 90 days lookback
  - name: store_id
    type: category
  - name: month
    type: number
output_features:
  - name: sales_forecast
    type: timeseries
    decoder:
      horizon: 30   # predict next 30 days
trainer:
  epochs: 50
  learning_rate: 1.0e-3
─────────────────────────────────────────────
$ pip install ludwig
$ ludwig train --config forecasting.yaml --dataset sales.csv
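Producing the next 30-day forecast is the same predict call used in the other examples. A sketch, where latest_window.csv is a hypothetical file holding the most recent 90-day window per store:

$ ludwig predict --model_path results/ --dataset latest_window.csv --output_directory forecasts/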
model_type: llm
base_model: mistralai/Mistral-7B-v0.3
adapter:
  type: lora
input_features:
  - name: prompt
    type: text
output_features:
  - name: response
    type: text
trainer:
  type: finetune
  epochs: 3
backend:
  type: ray   # ← the only change from local training
  trainer:
    use_gpu: true
    num_workers: 4   # 4 × GPU workers
    strategy: fsdp   # or: ddp, deepspeed
    resources_per_worker:
      GPU: 1
─────────────────────────────────────────────
$ pip install "ludwig[llm,distributed]"
$ ludwig train --config distributed.yaml --dataset data.csv
import pandas as pd
from ludwig.api import LudwigModel
config = {
    "input_features": [
        {"name": "text", "type": "text",
         "encoder": {"type": "bert"}}
    ],
    "output_features": [
        {"name": "label", "type": "category"}
    ],
    "trainer": {"epochs": 5},
}
model = LudwigModel(config, logging_level="INFO")
df = pd.read_csv("dataset.csv")
# train, evaluate, predict
results = model.train(dataset=df)
preds, _ = model.predict(dataset=df)
# one-line AutoML baseline
from ludwig.automl import auto_train
automl = auto_train(
    dataset=df, target="label", time_limit_s=3600
)
# export model
model.save("./my_model")
model.save_torchscript("./my_model_export")
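A saved model directory can be reloaded later for inference with the same API. A short sketch (new_data.csv is a placeholder path):

from ludwig.api import LudwigModel
# reload the model saved above and score new data
loaded = LudwigModel.load("./my_model")
new_preds, _ = loaded.predict(dataset="new_data.csv")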
Every example ships with a config file, dataset, and expected results.
Fine-tune Llama or Mistral with LoRA on a custom dataset.
Reward-model-free RL and preference learning for LLMs.
Classify movie reviews with BERT — no PyTorch code needed.
MNIST, CIFAR, custom datasets with ResNet / ViT / DINOv2.
Pixel-level classification with UNet, SegFormer, FPN.
Detect anomalies in structured transaction data.
Multi-step forecasting with native timeseries output features.
Automated HPO with Optuna or Ray Tune, built in.
Token-level tagging with sequence output features.
Transcribe and classify audio with Wav2Vec2 encoders.
Answer natural-language questions about images in one model.
Multi-task, anomaly detection, seq2seq, machine translation, and more.
Beginners can get a strong baseline from auto_train() without knowing the internals. Experts can plug in custom PyTorch encoders, control activation functions, and tune every hyperparameter. The abstraction is a choice, not a ceiling.

4-bit quantization (QLoRA, e.g. via torchao) means you can fine-tune a 7B model on a single consumer GPU. You can also merge multiple trained adapters using TIES, DARE, or SVD.

The backend: ray config key enables multi-GPU DDP, FSDP, and DeepSpeed training without any code changes. For datasets too large to fit in memory, Ludwig supports lazy Dask-based preprocessing and efficient Ray Data pipelines.

Models export to SafeTensors, ONNX, or torch.export, and Ludwig includes a FastAPI REST server via ludwig serve. For LLM serving, Ludwig integrates with vLLM for high-throughput inference. You can also upload to HuggingFace Hub with auto-generated model cards.

Built-in datasets ship under the ludwig:// scheme (e.g. ludwig://alpaca, ludwig://mnist), and Ludwig can also pull datasets from HuggingFace Datasets and Kaggle.