
Losses

These losses, and their documentation, are included from the LossFunctions.jl package.

Pass the function as, e.g., elementwise_loss=L1DistLoss().

You can also declare your own loss as a function that takes two (unweighted) or three (weighted) scalar arguments. For example,

f(x, y, w) = abs(x-y)*w
options = Options(elementwise_loss=f)
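Built-in losses can also be parameterized. A minimal sketch using the Huber loss with a threshold of 1.0 (the explicit imports are shown for completeness and assume LossFunctions.jl is available; these loss types may already be re-exported for you):

using SymbolicRegression: Options
using LossFunctions: HuberLoss

# Quadratic for residuals below the threshold 1.0, linear above it
options = Options(elementwise_loss=HuberLoss(1.0))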

Regression

Regression losses work on the residual between prediction and target: r = ŷ - y.
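Any of the distance-based losses below can also be hand-rolled as a plain two-argument function of prediction and target. For instance, a sketch of a Huber-style loss written directly in terms of r (the threshold 1.5 here is an arbitrary illustrative choice):

function my_huber(prediction, target)
    r = abs(prediction - target)  # |ŷ - y|
    δ = 1.5                       # illustrative threshold
    return r <= δ ? r^2 / 2 : δ * (r - δ / 2)
end

options = Options(elementwise_loss=my_huber)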

LossFunctions.LPDistLoss Type
julia
LPDistLoss{P} <: DistanceLoss

The P-th power absolute distance loss. It is Lipschitz continuous iff P == 1, convex if and only if P >= 1, and strictly convex iff P > 1.

L(r) = |r|^P
LossFunctions.L1DistLoss Type
julia
L1DistLoss <: DistanceLoss

The absolute distance loss. Special case of the LPDistLoss with P=1. It is Lipschitz continuous and convex, but not strictly convex.

L(r) = |r|
              Lossfunction                     Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    3 │\.                     ./│    1 │            ┌------------│
      │ '\.                 ./' │      │            |            │
      │   \.               ./   │      │            |            │
      │    '\.           ./'    │      │_           |           _│
    L │      \.         ./      │   L' │            |            │
      │       '\.     ./'       │      │            |            │
      │         \.   ./         │      │            |            │
    0 │          '\./'          │   -1 │------------┘            │
      └────────────┴────────────┘      └────────────┴────────────┘
      -3                        3      -3                        3
                 ŷ - y                            ŷ - y
LossFunctions.L2DistLoss Type
julia
L2DistLoss <: DistanceLoss

The least squares loss. Special case of the LPDistLoss with P=2. It is strictly convex.

L(r) = |r|^2
              Lossfunction                     Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    9 │\                       /│    3 │                   .r/   │
      │".                     ."│      │                 .r'     │
      │ ".                   ." │      │              _./'       │
      │  ".                 ."  │      │_           .r/         _│
    L │   ".               ."   │   L' │         _:/'            │
      │    '\.           ./'    │      │       .r'               │
      │      \.         ./      │      │     .r'                 │
    0 │        "-.___.-"        │   -3 │  _/r'                   │
      └────────────┴────────────┘      └────────────┴────────────┘
      -3                        3      -2                        2
                 ŷ - y                            ŷ - y
LossFunctions.PeriodicLoss Type
julia
PeriodicLoss <: DistanceLoss

Measures distance on a circle of specified circumference c.

L(r) = 1 - \cos\left(\frac{2 \pi r}{c}\right)
LossFunctions.HuberLoss Type
julia
HuberLoss <: DistanceLoss

Loss function commonly used for robustness to outliers. For residuals larger than the threshold d it behaves like the L1DistLoss, while for smaller residuals it resembles the L2DistLoss. It is Lipschitz continuous and convex, but not strictly convex.

L(r) = \begin{cases} \frac{r^2}{2} & \text{if } |r| \le d \\ d \cdot |r| - \frac{d^2}{2} & \text{otherwise} \end{cases}
              Lossfunction (d=1)               Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    2 │                         │    1 │                .+-------│
      │                         │      │              ./'        │
      │\.                     ./│      │             ./          │
      │ '.                   .' │      │_           ./          _│
    L │   \.               ./   │   L' │           /'            │
      │     \.           ./     │      │          /'             │
      │      '.         .'      │      │        ./'              │
    0 │        '-.___.-'        │   -1 │-------+'                │
      └────────────┴────────────┘      └────────────┴────────────┘
      -2                        2      -2                        2
                 ŷ - y                            ŷ - y
LossFunctions.L1EpsilonInsLoss Type
julia
L1EpsilonInsLoss <: DistanceLoss

The ϵ-insensitive loss. Typically used in linear support vector regression. It ignores deviations smaller than ϵ, but penalizes larger deviations linearly. It is Lipschitz continuous and convex, but not strictly convex.

L(r) = \max\{0, |r| - \epsilon\}
              Lossfunction (ϵ=1)               Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    2 │\                       /│    1 │                  ┌------│
      │ \                     / │      │                  |      │
      │  \                   /  │      │                  |      │
      │   \                 /   │      │_      ___________!     _│
    L │    \               /    │   L' │      |                  │
      │     \             /     │      │      |                  │
      │      \           /      │      │      |                  │
    0 │       \_________/       │   -1 │------┘                  │
      └────────────┴────────────┘      └────────────┴────────────┘
      -3                        3      -2                        2
                 ŷ - y                            ŷ - y
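As a usage sketch, to ignore absolute residuals below 0.2 during the search (assuming the constructor takes ϵ as its argument, as in LossFunctions.jl):

options = Options(elementwise_loss=L1EpsilonInsLoss(0.2))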
LossFunctions.L2EpsilonInsLoss Type
julia
L2EpsilonInsLoss <: DistanceLoss

The quadratic ϵ-insensitive loss. Typically used in linear support vector regression. It ignores deviations smaller than ϵ, but penalizes larger deviations quadratically. It is convex, but not strictly convex.

L(r) = \max\{0, |r| - \epsilon\}^2
              Lossfunction (ϵ=0.5)             Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    8 │                         │    1 │                  /      │
      │:                       :│      │                 /       │
      │'.                     .'│      │                /        │
      │ \.                   ./ │      │_         _____/        _│
    L │  \.                 ./  │   L' │         /               │
      │   \.               ./   │      │        /                │
      │    '\.           ./'    │      │       /                 │
    0 │      '-._______.-'      │   -1 │      /                  │
      └────────────┴────────────┘      └────────────┴────────────┘
      -3                        3      -2                        2
                 ŷ - y                            ŷ - y
LossFunctions.LogitDistLoss Type
julia
LogitDistLoss <: DistanceLoss

The distance-based logistic loss for regression. It is strictly convex and Lipschitz continuous.

L(r) = -\ln\left(\frac{4 e^r}{(1 + e^r)^2}\right)
              Lossfunction                     Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    2 │                         │    1 │                   _--'''│
      │\                       /│      │                ./'      │
      │ \.                   ./ │      │              ./         │
      │  '.                 .'  │      │_           ./          _│
    L │   '.               .'   │   L' │           ./            │
      │     \.           ./     │      │         ./              │
      │      '.         .'      │      │       ./                │
    0 │        '-.___.-'        │   -1 │___.-''                  │
      └────────────┴────────────┘      └────────────┴────────────┘
      -3                        3      -4                        4
                 ŷ - y                            ŷ - y
LossFunctions.QuantileLoss Type
julia
QuantileLoss <: DistanceLoss

The distance-based quantile loss, also known as pinball loss, can be used to estimate conditional τ-quantiles. It is Lipschitz continuous and convex, but not strictly convex. Furthermore it is symmetric if and only if τ = 1/2.

L(r) = \begin{cases} -(1 - \tau)\, r & \text{if } r < 0 \\ \tau\, r & \text{if } r \ge 0 \end{cases}
              Lossfunction (τ=0.7)             Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    2 │'\                       │  0.3 │            ┌------------│
      │  \.                     │      │            |            │
      │   '\                    │      │_           |           _│
      │     \.                  │      │            |            │
    L │      '\              ._-│   L' │            |            │
      │        \.         ..-'  │      │            |            │
      │         '.     _r/'     │      │            |            │
    0 │           '_./'         │ -0.7 │------------┘            │
      └────────────┴────────────┘      └────────────┴────────────┘
      -3                        3      -3                        3
                 ŷ - y                            ŷ - y
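As a usage sketch, to estimate the conditional 0.9 quantile rather than the mean (assuming the constructor takes the quantile level τ, as in LossFunctions.jl):

options = Options(elementwise_loss=QuantileLoss(0.9))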

Classification

Classification losses (assuming binary classification) work on the margin between target and prediction: a = y ⋅ ŷ, where the target y is either -1 or +1.
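A margin-based loss can also be written by hand as a two-argument function, provided the targets are encoded as -1/+1. For instance, a sketch of a logistic-type margin loss (the same shape as the LogitMarginLoss documented below):

# Assumes target ∈ {-1, +1}
logistic_margin(prediction, target) = log1p(exp(-target * prediction))

options = Options(elementwise_loss=logistic_margin)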

LossFunctions.ZeroOneLoss Type
julia
ZeroOneLoss <: MarginLoss

The classical classification loss. It penalizes every misclassified observation with a loss of 1, while every correctly classified observation has a loss of 0. It is neither convex nor continuous, and thus seldom used directly. Instead one usually works with some classification-calibrated surrogate loss, such as L1HingeLoss.

L(a) = \begin{cases} 1 & \text{if } a < 0 \\ 0 & \text{if } a \ge 0 \end{cases}
              Lossfunction                     Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    1 │------------┐            │    1 │                         │
      │            |            │      │                         │
      │            |            │      │                         │
      │            |            │      │_________________________│
      │            |            │      │                         │
      │            |            │      │                         │
      │            |            │      │                         │
    0 │            └------------│   -1 │                         │
      └────────────┴────────────┘      └────────────┴────────────┘
      -2                        2      -2                        2
                y * h(x)                         y * h(x)
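Following that advice, a usage sketch that swaps in the hinge surrogate (targets still encoded as -1/+1):

options = Options(elementwise_loss=L1HingeLoss())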
LossFunctions.PerceptronLoss Type
julia
PerceptronLoss <: MarginLoss

The perceptron loss linearly penalizes every prediction where the resulting agreement <= 0. It is Lipschitz continuous and convex, but not strictly convex.

L(a) = \max\{0, -a\}
              Lossfunction                     Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    2 │\.                       │    0 │            ┌------------│
      │ '..                     │      │            |            │
      │   \.                    │      │            |            │
      │     '.                  │      │            |            │
    L │      '.                 │   L' │            |            │
      │        \.               │      │            |            │
      │         '.              │      │            |            │
    0 │           \.____________│   -1 │------------┘            │
      └────────────┴────────────┘      └────────────┴────────────┘
      -2                        2      -2                        2
                 y ⋅ ŷ                            y ⋅ ŷ
LossFunctions.LogitMarginLoss Type
julia
LogitMarginLoss <: MarginLoss

The margin version of the logistic loss. It is infinitely many times differentiable, strictly convex, and Lipschitz continuous.

L(a) = \ln(1 + e^{-a})
              Lossfunction                     Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    2 │ \.                      │    0 │                  ._--/""│
      │   \.                    │      │               ../'      │
      │     \.                  │      │              ./         │
      │       \..               │      │            ./'          │
    L │         '-_             │   L' │          .,'            │
      │            '-_          │      │         ./              │
      │               '\-._     │      │      .,/'               │
    0 │                    '""*-│   -1 │__.--''                  │
      └────────────┴────────────┘      └────────────┴────────────┘
      -2                        2      -4                        4
                 y ⋅ ŷ                            y ⋅ ŷ
LossFunctions.L1HingeLoss Type
julia
L1HingeLoss <: MarginLoss

The hinge loss linearly penalizes every prediction where the resulting agreement < 1. It is Lipschitz continuous and convex, but not strictly convex.

L(a) = \max\{0, 1 - a\}
              Lossfunction                     Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    3 │'\.                      │    0 │                  ┌------│
      │  ''_                    │      │                  |      │
      │     \.                  │      │                  |      │
      │       '.                │      │                  |      │
    L │         ''_             │   L' │                  |      │
      │            \.           │      │                  |      │
      │              '.         │      │                  |      │
    0 │                ''_______│   -1 │------------------┘      │
      └────────────┴────────────┘      └────────────┴────────────┘
      -2                        2      -2                        2
                 y ⋅ ŷ                            y ⋅ ŷ
LossFunctions.L2HingeLoss Type
julia
L2HingeLoss <: MarginLoss

The truncated least squares loss quadratically penalizes every prediction where the resulting agreement < 1. It is locally Lipschitz continuous and convex, but not strictly convex.

L(a) = \max\{0, 1 - a\}^2
              Lossfunction                     Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    5 │     .                   │    0 │                 ,r------│
      │     '.                  │      │               ,/        │
      │      '\                 │      │             ,/          │
      │        \                │      │           ,/            │
    L │         '.              │   L' │         ./              │
      │          '.             │      │       ./                │
      │            \.           │      │     ./                  │
    0 │              '-.________│   -5 │   ./                    │
      └────────────┴────────────┘      └────────────┴────────────┘
      -2                        2      -2                        2
                 y ⋅ ŷ                            y ⋅ ŷ
LossFunctions.SmoothedL1HingeLoss Type
julia
SmoothedL1HingeLoss <: MarginLoss

As the name suggests, a smoothed version of the L1HingeLoss. It is Lipschitz continuous and convex, but not strictly convex.

L(a) = \begin{cases} \frac{0.5}{\gamma} \cdot \max\{0, 1 - a\}^2 & \text{if } a \ge 1 - \gamma \\ 1 - \frac{\gamma}{2} - a & \text{otherwise} \end{cases}
              Lossfunction (γ=2)               Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    2 │\.                       │    0 │                 ,r------│
      │ '.                      │      │               ./'       │
      │   \.                    │      │              ,/         │
      │     '.                  │      │            ./'          │
    L │      '.                 │   L' │           ,'            │
      │        \.               │      │         ,/              │
      │          ',             │      │       ./'               │
    0 │            '*-._________│   -1 │______./                 │
      └────────────┴────────────┘      └────────────┴────────────┘
      -2                        2      -2                        2
                 y ⋅ ŷ                            y ⋅ ŷ
LossFunctions.ModifiedHuberLoss Type
julia
ModifiedHuberLoss <: MarginLoss

A special (4 times scaled) case of the SmoothedL1HingeLoss with γ=2. It is Lipschitz continuous and convex, but not strictly convex.

L(a) = \begin{cases} \max\{0, 1 - a\}^2 & \text{if } a \ge -1 \\ -4a & \text{otherwise} \end{cases}
              Lossfunction                     Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    5 │    '.                   │    0 │                .+-------│
      │     '.                  │      │              ./'        │
      │      '\                 │      │             ,/          │
      │        \                │      │           ,/            │
    L │         '.              │   L' │         ./              │
      │          '.             │      │       ./'               │
      │            \.           │      │______/'                 │
    0 │              '-.________│   -5 │                         │
      └────────────┴────────────┘      └────────────┴────────────┘
      -2                        2      -2                        2
                 y ⋅ ŷ                            y ⋅ ŷ
LossFunctions.L2MarginLoss Type
julia
L2MarginLoss <: MarginLoss

The margin-based least-squares loss for classification, which penalizes every prediction where agreement != 1 quadratically. It is locally Lipschitz continuous and strongly convex.

L(a) = (1 - a)^2
              Lossfunction                     Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    5 │     .                   │    2 │                       ,r│
      │     '.                  │      │                     ,/  │
      │      '\                 │      │                   ,/    │
      │        \                │      ├                 ,/      ┤
    L │         '.              │   L' │               ./        │
      │          '.             │      │             ./          │
      │            \.          .│      │           ./            │
    0 │              '-.____.-' │   -3 │         ./              │
      └────────────┴────────────┘      └────────────┴────────────┘
      -2                        2      -2                        2
                 y ⋅ ŷ                            y ⋅ ŷ
LossFunctions.ExpLoss Type
julia
ExpLoss <: MarginLoss

The margin-based exponential loss for classification, which penalizes every prediction exponentially. It is infinitely many times differentiable, locally Lipschitz continuous and strictly convex, but not clipable.

L(a) = e^{-a}
              Lossfunction                     Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    5 │  \.                     │    0 │               _,,---:'""│
      │   l                     │      │           _r/"'         │
      │    l.                   │      │        .r/'             │
      │     ":                  │      │      .r'                │
    L │       \.                │   L' │     ./                  │
      │        "\..             │      │    .'                   │
      │           '":,_         │      │   ,'                    │
    0 │                ""---:.__│   -5 │  ./                     │
      └────────────┴────────────┘      └────────────┴────────────┘
      -2                        2      -2                        2
                 y ⋅ ŷ                            y ⋅ ŷ
LossFunctions.SigmoidLoss Type
julia
SigmoidLoss <: MarginLoss

Continuous loss which penalizes every prediction with a loss in the range (0,2). It is infinitely many times differentiable and Lipschitz continuous, but nonconvex.

L(a) = 1 - \tanh(a)
              Lossfunction                     Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    2 │""'--,.                  │    0 │..                     ..│
      │      '\.                │      │ "\.                 ./" │
      │         '.              │      │    ',             ,'    │
      │           \.            │      │      \           /      │
    L │            "\.          │   L' │       \         /       │
      │              \.         │      │        \.     ./        │
      │                \,       │      │         \.   ./         │
    0 │                  '"-:.__│   -1 │          ',_,'          │
      └────────────┴────────────┘      └────────────┴────────────┘
      -2                        2      -2                        2
                 y ⋅ ŷ                            y ⋅ ŷ
LossFunctions.DWDMarginLoss Type
julia
DWDMarginLoss <: MarginLoss

The distance weighted discrimination margin loss. It is a differentiable generalization of the L1HingeLoss that is different from the SmoothedL1HingeLoss. It is Lipschitz continuous and convex, but not strictly convex.

L(a) = \begin{cases} 1 - a & \text{if } a \le \frac{q}{q+1} \\ \frac{1}{a^q} \cdot \frac{q^q}{(q+1)^{q+1}} & \text{otherwise} \end{cases}
              Lossfunction (q=1)               Derivative
      ┌────────────┬────────────┐      ┌────────────┬────────────┐
    2 │      ".                 │    0 │                     ._r-│
      │        \.               │      │                   ./    │
      │         ',              │      │                 ./      │
      │           \.            │      │                 /       │
    L │            "\.          │   L' │                .        │
      │              \.         │      │                /        │
      │               ":__      │      │               ;         │
    0 │                   '""---│   -1 │---------------┘         │
      └────────────┴────────────┘      └────────────┴────────────┘
      -2                        2      -2                        2
                 y ⋅ ŷ                            y ⋅ ŷ