Text Classification
This example shows how to build a text classifier with Ludwig. It can be performed using the Reuters-21578 dataset, in particular the version available on CMU's Text Analytics course website. Other datasets available on the same webpage, such as OHSUMED, a well-known medical abstracts dataset, and Epinions.com, a dataset of product reviews, can be used as well, since their column names are the same.
text | class |
---|---|
Toronto Feb 26 - Standard Trustco said it expects earnings in 1987 to increase at least 15... | earnings |
New York Feb 26 - American Express Co remained silent on market rumors... | acquisition |
BANGKOK March 25 - Vietnam will resettle 300000 people on state farms known as new economic... | coffee |
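Any CSV with a `text` column and a `class` column fits this layout. As a minimal sketch of that shape, the snippet below writes the three sample rows from the table to a file using only Python's standard library. The helper names `write_dataset` and `read_header` are hypothetical, and the texts are truncated exactly as in the table, so the resulting file illustrates the format only and is not usable training data.

```python
import csv

# Sample rows copied from the table above. The texts are truncated there,
# so this file only demonstrates the expected two-column layout.
ROWS = [
    {"text": "Toronto Feb 26 - Standard Trustco said it expects earnings in 1987 to increase at least 15...",
     "class": "earnings"},
    {"text": "New York Feb 26 - American Express Co remained silent on market rumors...",
     "class": "acquisition"},
    {"text": "BANGKOK March 25 - Vietnam will resettle 300000 people on state farms known as new economic...",
     "class": "coffee"},
]


def write_dataset(path, rows):
    """Write rows to a CSV file whose header is exactly: text,class."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["text", "class"])
        writer.writeheader()
        writer.writerows(rows)


def read_header(path):
    """Return the column names found on the first line of the CSV."""
    with open(path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f))


if __name__ == "__main__":
    write_dataset("text_classification.csv", ROWS)
    print(read_header("text_classification.csv"))
```

Checking the header this way is a quick guard before training, because the feature names in the configuration have to match the column names in the file.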
ludwig experiment \
  --dataset text_classification.csv \
  --config_file config.yaml
With `config.yaml`:

input_features:
    -
        name: text
        type: text
        level: word
        encoder: parallel_cnn

output_features:
    -
        name: class
        type: category
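The same experiment can probably be started from Python instead of the command line. The sketch below mirrors the configuration above as a dictionary and wraps the call in a hypothetical helper, `run_experiment`. It assumes Ludwig is installed and that its Python API exposes `LudwigModel` with an `experiment` method accepting a `dataset` argument. Argument names and accepted configuration keys differ between Ludwig releases, so check them against the installed version.

```python
# Dictionary equivalent of config.yaml from this example.
config = {
    "input_features": [
        {
            "name": "text",
            "type": "text",
            "level": "word",
            "encoder": "parallel_cnn",
        }
    ],
    "output_features": [
        {
            "name": "class",
            "type": "category",
        }
    ],
}


def run_experiment(dataset_path):
    """Train and evaluate on dataset_path (hypothetical helper).

    Ludwig is imported lazily so that the config above can be inspected
    even on a machine where Ludwig is not installed.
    """
    from ludwig.api import LudwigModel  # assumption: Ludwig's Python API

    model = LudwigModel(config)
    # Assumption: experiment() trains and then evaluates, like the CLI command.
    return model.experiment(dataset=dataset_path)


if __name__ == "__main__":
    run_experiment("text_classification.csv")
```

Keeping the configuration as a plain dictionary makes it easy to change the encoder or add features programmatically before training.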