Visual Question Answering

image_path               question                                   answer
-----------------------  -----------------------------------------  ------
imdata/image_000001.jpg  Is there snow on the mountains?            yes
imdata/image_000002.jpg  What color are the wheels                  blue
imdata/image_000003.jpg  What kind of utensil is in the glass bowl  knife
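
The experiment command reads this data from a file named vqa.csv. As a minimal sketch, assuming the file is a plain comma-separated file with the three columns shown above, it could be produced with Python's standard csv module. The helper name write_vqa_csv is hypothetical and not part of Ludwig; the rows are copied from the example data.

```python
import csv

# Rows copied from the example data: (image_path, question, answer).
rows = [
    ("imdata/image_000001.jpg", "Is there snow on the mountains?", "yes"),
    ("imdata/image_000002.jpg", "What color are the wheels", "blue"),
    ("imdata/image_000003.jpg", "What kind of utensil is in the glass bowl", "knife"),
]


def write_vqa_csv(path, rows):
    """Hypothetical helper: write a header line plus one line per example."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # Column names must match the feature names used in the config.
        writer.writerow(["image_path", "question", "answer"])
        writer.writerows(rows)


if __name__ == "__main__":
    write_vqa_csv("vqa.csv", rows)
```

Using csv.writer rather than joining strings by hand means a question that contains a comma is quoted correctly.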

ludwig experiment \
  --dataset vqa.csv \
  --config_file config.yaml

With config.yaml:

input_features:
    -
        name: image_path
        type: image
        encoder: stacked_cnn
    -
        name: question
        type: text
        level: word
        encoder: parallel_cnn

output_features:
    -
        name: answer
        type: text
        level: word
        decoder: generator
        cell_type: lstm
        loss:
            type: sampled_softmax_cross_entropy
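
The same experiment can probably also be started from Python instead of the command line. The sketch below assumes that the two feature groups in the config belong under input_features and output_features, that the loss type is nested under a loss key, and that the installed Ludwig release exposes LudwigModel in ludwig.api with an experiment method taking a dataset argument. The run_experiment wrapper is hypothetical, and releases newer than the one this example targets may expect a different config layout.

```python
# The settings from config.yaml expressed as a Python dict.
# Assumption: the image and question features are inputs, the answer is the output.
config = {
    "input_features": [
        {"name": "image_path", "type": "image", "encoder": "stacked_cnn"},
        {"name": "question", "type": "text", "level": "word",
         "encoder": "parallel_cnn"},
    ],
    "output_features": [
        {
            "name": "answer",
            "type": "text",
            "level": "word",
            "decoder": "generator",
            "cell_type": "lstm",
            "loss": {"type": "sampled_softmax_cross_entropy"},
        },
    ],
}


def run_experiment(dataset_path="vqa.csv"):
    """Hypothetical wrapper: train and evaluate on the given CSV file."""
    # Imported lazily so the config can be inspected without Ludwig installed.
    from ludwig.api import LudwigModel

    model = LudwigModel(config)
    return model.experiment(dataset=dataset_path)
```

Calling run_experiment() requires Ludwig to be installed along with its image and text feature dependencies.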