Add a Metric
Metrics are used to report model performance during training and evaluation, and also serve as optimization objectives for hyperparameter optimization.
Concretely, metrics are modules which compute a function of the model's output for each batch and aggregate the
function's result over all batches. A common example of a metric is the LossMetric, which computes the average batch
loss. Metrics are defined in ludwig/modules/metric_modules.py. Ludwig's metrics are designed to be consistent with
torchmetrics and conform to the interface of torchmetrics.Metric.
Note
Before implementing a new metric from scratch, check the
torchmetrics documentation to see if the desired function is
available there. Metrics from torchmetrics can often be added to Ludwig trivially; see RMSEMetric in
ludwig/modules/metric_modules.py for an example.
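For instance, a root-mean-squared-error metric can be little more than a thin wrapper around a torchmetrics class. The sketch below approximates what RMSEMetric looks like: torchmetrics' MeanSquaredError already implements the update/compute machinery, and passing squared=False turns MSE into RMSE. The import location of LudwigMetric is an assumption based on the file mentioned above, and registration plus the other required methods are covered in the steps below.
from torchmetrics import MeanSquaredError

from ludwig.modules.metric_modules import LudwigMetric  # assumed location

class RMSEMetric(MeanSquaredError, LudwigMetric):
    """RMSE wrapper: torchmetrics does the heavy lifting; squared=False gives RMSE."""

    def __init__(self, **kwargs):
        super().__init__(squared=False, **kwargs)
    # get_objective and get_inputs are covered in the steps below.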
1. Add a new metric class
For the majority of use cases, metrics should be averaged over batches. For this, Ludwig provides a MeanMetric class
that keeps a running average of its values. The following examples assume averaging is desired and inherit from
MeanMetric. If you need different aggregation behavior, inherit from LudwigMetric instead and accumulate the
metric values as needed.
We'll use TokenAccuracyMetric as an example. It treats each token of a sequence as an independent prediction and
computes the average accuracy over sequences.
First, declare the new metric class in ludwig/modules/metric_modules.py:
class TokenAccuracyMetric(MeanMetric):
2. Implement required methods
get_current_value
If using MeanMetric, implement get_current_value, which computes the value of the metric given a batch of feature
outputs and target values.
def get_current_value(self, preds: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Compute the metric over a batch of predictions (preds) and truth values (target).
    # Aggregate the metric over the batch.
    return metric_value
Inputs
- preds (torch.Tensor): A batch of outputs from an output feature which are either predictions, probabilities, or logits depending on the return value of get_inputs.
- target (torch.Tensor): The batch of true labels for the dataset column corresponding to the metric's output feature.
Return
- (torch.Tensor): The computed metric; in most cases this will be a scalar value.
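For TokenAccuracyMetric, a minimal get_current_value might look like the following sketch. It compares predicted and target tokens elementwise and averages the result; note that it ignores the masking of padding tokens, which a real sequence metric would also need to handle.
def get_current_value(self, preds: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Fraction of tokens predicted correctly, averaged over the batch.
    # Padding positions are not masked out in this sketch.
    return (preds == target).float().mean()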
update and reset
If not using MeanMetric, implement update and reset instead of get_current_value.
def update(self, preds: torch.Tensor, target: torch.Tensor) -> None:
    # Compute the metric over a batch of predictions (preds) and truth values (target).
    # Accumulate metric values or aggregate statistics.
Inputs
- preds (torch.Tensor): A batch of outputs from an output feature which are either predictions, probabilities, or logits depending on the return value of get_inputs.
- target (torch.Tensor): The batch of true labels for the dataset column corresponding to the metric's output feature.
def reset(self) -> None:
    # Reset accumulated values.
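As a concrete (hypothetical) illustration of this path, the sketch below accumulates raw counts across batches instead of keeping a running mean. The class name is invented for illustration, and it assumes LudwigMetric follows the torchmetrics.Metric interface, as stated above. State registered through torchmetrics' add_state is restored to its defaults by the inherited reset, so the sketch relies on that; override reset yourself if you keep state outside add_state.
import torch

from ludwig.modules.metric_modules import LudwigMetric  # assumed location

class TokenErrorRateMetric(LudwigMetric):  # hypothetical example, not in Ludwig
    def __init__(self, **kwargs):
        super().__init__()
        # States registered via add_state are cleared automatically by reset().
        self.add_state("errors", default=torch.tensor(0), dist_reduce_fx="sum")
        self.add_state("total", default=torch.tensor(0), dist_reduce_fx="sum")

    def update(self, preds: torch.Tensor, target: torch.Tensor) -> None:
        # Accumulate raw counts rather than a per-batch average.
        self.errors += (preds != target).sum()
        self.total += target.numel()

    def compute(self) -> torch.Tensor:
        # Aggregate the accumulated statistics into the final metric value.
        return self.errors.float() / self.total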
Note
MeanMetric's update method simply delegates metric computation to get_current_value.
def update(self, preds: torch.Tensor, target: torch.Tensor) -> None:
    self.avg.update(self.get_current_value(preds, target))
get_objective
The return value of get_objective tells Ludwig whether to minimize or maximize this metric in hyperparameter
optimization.
@classmethod
def get_objective(cls):
    return MAXIMIZE
Return
- (str): How this metric should be optimized, one of MINIMIZE or MAXIMIZE.
get_inputs
Determines which feature output is passed to this metric's update or get_current_value method. Valid return
values are:
- PREDICTIONS: The predicted values of the output feature.
- PROBABILITIES: The vector of probabilities.
- LOGITS: The vector of outputs of the feature decoder's final layer (before the application of any sigmoid or softmax function).
@classmethod
def get_inputs(cls):
    return PREDICTIONS
Return
- (str): Which output this metric derives its value from, one of PREDICTIONS, PROBABILITIES, or LOGITS.
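To make the choice concrete: a metric that scores hard predictions, like token accuracy, returns PREDICTIONS, while a ranking metric such as ROC-AUC would need PROBABILITIES instead. A sketch of the latter:
@classmethod
def get_inputs(cls):
    # An AUC-style metric scores the model's confidence ranking,
    # so it consumes probabilities rather than argmax predictions.
    return PROBABILITIES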
3. Add the new metric class to the registry
The mapping between metric names in the config and metric classes is established by registering the class in the
metric registry, which is defined in ludwig/modules/metric_registry.py. To register your class, add the @register_metric
decorator on the line above its class definition, specifying the name of the metric and a list of the supported output
feature types:
@register_metric(TOKEN_ACCURACY, [SEQUENCE, TEXT])
class TokenAccuracyMetric(MeanMetric):
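Putting the pieces together, a complete (if simplified) version of the metric might look like the sketch below. The import locations are assumptions based on the file paths mentioned above, and the get_current_value body omits the sequence masking a production implementation would need.
import torch

from ludwig.constants import MAXIMIZE, PREDICTIONS, SEQUENCE, TEXT, TOKEN_ACCURACY  # assumed location
from ludwig.modules.metric_modules import MeanMetric  # assumed location
from ludwig.modules.metric_registry import register_metric

@register_metric(TOKEN_ACCURACY, [SEQUENCE, TEXT])
class TokenAccuracyMetric(MeanMetric):
    def __init__(self, **kwargs):
        super().__init__()

    def get_current_value(self, preds: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # Per-token accuracy averaged over the batch (padding not masked).
        return (preds == target).float().mean()

    @classmethod
    def get_objective(cls):
        return MAXIMIZE

    @classmethod
    def get_inputs(cls):
        return PREDICTIONS
Once registered, the metric is reported for sequence and text output features under the name bound to TOKEN_ACCURACY and can be referenced by that name in the config, for example as the trainer's validation_metric.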