Binary Classification (Titanic)

This example describes how to use Ludwig to train a model for the kaggle competition, on predicting a passenger's probability of surviving the Titanic disaster. Here's a sample of the data:

Pclass	Sex	Age	SibSp	Fare	Survived	Embarked
3	male	22	1	7.2500	0	S
1	female	38	1	71.2833	1	C
3	female	26	0	7.9250	0	S
3	male	35	0	8.0500	0	S

The full data and the column descriptions can be found here.

After downloading the data, to train a model on this dataset using Ludwig,

ludwig experiment \
  --dataset <PATH_TO_TITANIC_CSV> \
  --config_file config.yaml

With config.yaml:

input_features:
    -
        name: Pclass
        type: category
    -
        name: Sex
        type: category
    -
        name: Age
        type: numerical
        preprocessing:
          missing_value_strategy: fill_with_mean
    -
        name: SibSp
        type: numerical
    -
        name: Parch
        type: numerical
    -
        name: Fare
        type: numerical
        preprocessing:
          missing_value_strategy: fill_with_mean
    -
        name: Embarked
        type: category

output_features:
    -
        name: Survived
        type: binary

Better results can be obtained with morerefined feature transformations and preprocessing, but this example has the only aim to show how this type do tasks and data can be used in Ludwig.