# Add a Loss Function

At a high level, a loss function evaluates how well a model predicts a dataset. Loss functions should always output a scalar, and lower loss corresponds to a better fit, so the objective of training is to minimize the loss.

Ludwig losses conform to the `torch.nn.Module` interface and are declared in `ludwig/modules/loss_modules.py`. Before implementing a new loss from scratch, check the documentation of torch.nn loss functions to see if the desired loss is already available: adding a torch loss to Ludwig is simpler than implementing one from scratch.

## Add a torch loss to Ludwig

Torch losses whose call signature takes model outputs and targets, i.e. `loss(model(input), target)`, can be added to Ludwig easily by declaring a trivial subclass in `ludwig/modules/loss_modules.py` and registering the loss for one or more output feature types. This example adds `MAELoss` (mean absolute error loss) to Ludwig:

```
@register_loss("mean_absolute_error", [NUMBER, TIMESERIES, VECTOR])
class MAELoss(torch.nn.L1Loss, LogitsInputsMixin):
    def __init__(self, **kwargs):
        super().__init__()
```
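Since `MAELoss` inherits its computation from `torch.nn.L1Loss`, it behaves exactly like torch's L1 loss. A quick standalone check of what that computes:

```
import torch

# MAELoss above inherits from torch.nn.L1Loss, which averages the
# absolute differences between predictions and targets.
loss_fn = torch.nn.L1Loss()
preds = torch.tensor([1.0, 2.0, 4.0])
target = torch.tensor([1.0, 3.0, 2.0])
loss = loss_fn(preds, target)  # mean(|0|, |-1|, |2|) = 1.0
```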

The `@register_loss` decorator registers the loss under the name `mean_absolute_error` and indicates that it is supported for `NUMBER`, `TIMESERIES`, and `VECTOR` output features.
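Conceptually, the decorator maintains a mapping from feature type and loss name to loss class. A minimal sketch of this pattern, assuming nothing about Ludwig's actual internals (the registry name and structure here are illustrative):

```
# Illustrative sketch of a loss registry decorator; not Ludwig's real code.
loss_registry = {}  # feature type -> {loss name -> loss class}

def register_loss(name, feature_types):
    def wrap(cls):
        for feature_type in feature_types:
            # Register the same class under each supported feature type.
            loss_registry.setdefault(feature_type, {})[name] = cls
        return cls
    return wrap

@register_loss("mean_absolute_error", ["number", "timeseries", "vector"])
class MAELossSketch:
    pass
```

After registration, the class is retrievable by feature type and loss name, which is how a config entry like `mean_absolute_error` can be resolved to a class at model-build time.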

## Implement a loss from scratch

### Implement loss function

To implement a new loss function, we recommend first implementing it as a function of logits and labels, plus any other configuration parameters. For this example, let's suppose we have implemented the tempered softmax from "Robust Bi-Tempered Logistic Loss Based on Bregman Divergences". This loss function takes two constant parameters `t1` and `t2`, which we'd like to allow users to specify in the config.

Assuming we have the following function:

```
def tempered_softmax_cross_entropy_loss(
    logits: torch.Tensor,
    labels: torch.Tensor,
    t1: float,
    t2: float,
) -> torch.Tensor:
    # Computes the loss, returns the result as a torch.Tensor.
    ...
```
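For illustration, here is a stand-in with the same signature. Its body computes ordinary per-sample softmax cross-entropy, which is what the bi-tempered loss reduces to when `t1 == t2 == 1.0`; a real implementation would use the tempered log/exp functions from the paper.

```
import math
import torch
import torch.nn.functional as F

def tempered_softmax_cross_entropy_loss(
    logits: torch.Tensor,
    labels: torch.Tensor,
    t1: float,
    t2: float,
) -> torch.Tensor:
    # Stand-in body: at t1 == t2 == 1.0 the bi-tempered loss reduces to
    # ordinary softmax cross-entropy, computed here per sample.
    return F.cross_entropy(logits, labels, reduction="none")

# Uniform logits over 3 classes give a per-sample loss of log(3) each.
per_sample = tempered_softmax_cross_entropy_loss(
    torch.zeros(4, 3), torch.tensor([0, 1, 2, 0]), t1=1.0, t2=1.0
)
```

Note that the function returns one loss value per example (`reduction="none"`); the module's `forward` below is responsible for averaging over the batch.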

### Define and register module

Next, we'll define a module class which computes our loss function and add it to the loss registry for `CATEGORY` output features with `@register_loss`. `LogitsInputsMixin` tells Ludwig that this loss should be called with the output feature `logits`, which are the feature decoder outputs before normalization to a probability distribution.

```
@register_loss("tempered_softmax_cross_entropy", [CATEGORY])
class TemperedSoftmaxCrossEntropy(torch.nn.Module, LogitsInputsMixin):
```

Note

It is possible to define losses on other outputs besides `logits`, but this is not used in Ludwig today. For example, loss could be computed over `probabilities`, but it is usually more numerically stable to compute the loss from `logits` (rather than backpropagating loss through a softmax function).
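The stability point can be seen directly with extreme logits: computing cross-entropy from logits stays finite, while passing through an explicit softmax first collapses a small probability to zero and the subsequent log blows up.

```
import torch
import torch.nn.functional as F

logits = torch.tensor([[1000.0, 0.0]])
target = torch.tensor([1])  # the class the model assigns ~zero probability

# From logits: cross_entropy applies log-softmax internally using the
# log-sum-exp trick, so the loss stays finite (~1000 here).
from_logits = F.cross_entropy(logits, target)

# From probabilities: softmax first underflows the small probability to
# exactly 0, and log(0) then makes the loss infinite.
probs = torch.softmax(logits, dim=-1)
from_probs = -torch.log(probs[0, target[0]])
```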

### constructor

The loss constructor will receive any parameters specified in the config as kwargs. It must provide reasonable defaults for all arguments.

```
def __init__(self, t1: float = 1.0, t2: float = 1.0, **kwargs):
    super().__init__()
    self.t1 = t1
    self.t2 = t2
```
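A small sketch of the kwargs behavior (the class name here is illustrative): parameters omitted from the config fall back to their defaults, and unrecognized keys are absorbed harmlessly by `**kwargs`.

```
# Illustrative class, not Ludwig code: shows how config kwargs interact
# with constructor defaults.
class LossParams:
    def __init__(self, t1: float = 1.0, t2: float = 1.0, **kwargs):
        self.t1 = t1
        self.t2 = t2

# t1 is omitted (defaults to 1.0); the extra key is silently absorbed.
params = LossParams(t2=2.0, unrelated_option=True)
```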

### forward

The forward method is responsible for computing the loss. Here we'll call `tempered_softmax_cross_entropy_loss` after ensuring its inputs are the correct type, and return its output averaged over the batch.

```
def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    labels = target.long()
    loss = tempered_softmax_cross_entropy_loss(logits, labels, self.t1, self.t2)
    return torch.mean(loss)
```
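Putting the pieces together, here is a self-contained sketch of the module (Ludwig's registry decorator and `LogitsInputsMixin` are omitted, and ordinary softmax cross-entropy stands in for the tempered loss, which is its `t1 == t2 == 1.0` special case):

```
import math
import torch
import torch.nn.functional as F

class TemperedSoftmaxCrossEntropy(torch.nn.Module):
    """Sketch of the module above, minus Ludwig's registry and mixin."""

    def __init__(self, t1: float = 1.0, t2: float = 1.0, **kwargs):
        super().__init__()
        self.t1 = t1
        self.t2 = t2

    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        labels = target.long()
        # Stand-in for tempered_softmax_cross_entropy_loss(logits, labels,
        # self.t1, self.t2); plain cross-entropy is the t1 == t2 == 1 case.
        loss = F.cross_entropy(logits, labels, reduction="none")
        return torch.mean(loss)

loss_fn = TemperedSoftmaxCrossEntropy(t1=1.0, t2=1.0)
logits = torch.zeros(4, 3)          # uniform predictions over 3 classes
target = torch.tensor([0, 1, 2, 0])
loss = loss_fn(logits, target)      # scalar tensor, here log(3)
```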