Binary Classification (Titanic)
This example describes how to use Ludwig to train a model for the kaggle competition, on predicting a passenger's probability of surviving the Titanic disaster. Here's a sample of the data:
Pclass | Sex | Age | SibSp | Parch | Fare | Survived | Embarked |
---|---|---|---|---|---|---|---|
3 | male | 22 | 1 | 0 | 7.2500 | 0 | S |
1 | female | 38 | 1 | 0 | 71.2833 | 1 | C |
3 | female | 26 | 0 | 0 | 7.9250 | 0 | S |
3 | male | 35 | 0 | 0 | 8.0500 | 0 | S |
The full data and the column descriptions can be found here.
After downloading the data, to train a model on this dataset using Ludwig,
ludwig experiment \
--dataset <PATH_TO_TITANIC_CSV> \
--config config.yaml
With config.yaml
:
input_features:
-
name: Pclass
type: category
-
name: Sex
type: category
-
name: Age
type: number
preprocessing:
missing_value_strategy: fill_with_mean
-
name: SibSp
type: number
-
name: Parch
type: number
-
name: Fare
type: number
preprocessing:
missing_value_strategy: fill_with_mean
-
name: Embarked
type: category
output_features:
-
name: Survived
type: binary
Better results can be obtained with more refined feature transformations and preprocessing, but this example has the only aim to show how this type do tasks and data can be used in Ludwig.