Speaker Verification
This example describes how to use Ludwig for a simple speaker verification task. We assume to have the following data with label 0 corresponding to an audio file of an unauthorized voice and label 1 corresponding to an audio file of an authorized voice. The sample data looks as follows:
| audio_path | label | 
|---|---|
| audiodata/audio_000001.wav | 0 | 
| audiodata/audio_000002.wav | 0 | 
| audiodata/audio_000003.wav | 1 | 
| audiodata/audio_000004.wav | 1 | 
ludwig experiment \
--dataset speaker_verification.csv \
  --config config.yaml
With config.yaml:
input_features:
    -
        name: audio_path
        type: audio
        preprocessing:
            audio_file_length_limit_in_s: 7.0
            audio_feature:
                type: stft
                window_length_in_s: 0.04
                window_shift_in_s: 0.02
        encoder: 
            type: cnnrnn
output_features:
    -
        name: label
        type: binary