# Softmax converts a vector of values to a probability distribution.

Source: `R/activations.R`

`activation_softmax.Rd`

The elements of the output vector are in the range `[0, 1]` and sum to 1.

Each input vector is handled independently. The `axis` argument sets which axis of the input the function is applied along.
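To illustrate what "applied along an axis" means, the base R sketch below normalizes a small matrix along its last axis, so each row is treated as an independent input vector. The helper `softmax_vec()` is a hypothetical name used only for illustration and is not part of the package; it assumes the standard definition `exp(x) / sum(exp(x))`.

```r
# Hypothetical helper (not part of the package): softmax of one numeric vector.
softmax_vec <- function(x) exp(x) / sum(exp(x))

# Two input vectors stored as the rows of a 2 x 3 matrix.
m <- rbind(c(1, 2, 3),
           c(0, 0, 0))

# Apply softmax along the last axis (across columns, within each row).
# apply() returns the per-row results as columns, so transpose back to 2 x 3.
row_softmax <- t(apply(m, 1, softmax_vec))

rowSums(row_softmax)  # each row sums to 1, up to floating-point rounding
row_softmax[2, ]      # equal inputs give equal probabilities of 1/3
```

Normalizing along the other axis of the same matrix would instead make each column sum to 1, which is why the choice of axis matters for batched inputs.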

Softmax is often used as the activation for the last layer of a classification network because the result can be interpreted as a probability distribution.

The softmax of each vector `x` is computed as `exp(x) / sum(exp(x))`.
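The formula can be sketched directly in base R. The function `softmax_ref()` below is a hypothetical reference implementation, not the package's own code. It subtracts `max(x)` before exponentiating, which is an added assumption for numerical stability: the shift cancels in the ratio, so the result is mathematically the same as `exp(x) / sum(exp(x))`.

```r
# Hypothetical reference implementation, not the package's own code.
softmax_ref <- function(x) {
  # Subtracting max(x) does not change the result mathematically,
  # but it keeps exp() from overflowing for large inputs.
  z <- exp(x - max(x))
  z / sum(z)
}

p <- softmax_ref(c(1, 2, 3))
sum(p)                             # 1, up to floating-point rounding
all(p >= 0 & p <= 1)               # TRUE
softmax_ref(c(1000, 1001, 1002))   # finite, and equal to p
```

Without the shift, `exp(1000)` evaluates to `Inf` in double precision and the naive ratio returns `NaN`.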

The input values are the log-odds of the resulting probabilities.
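One way to read the log-odds statement: the log of the ratio of two output probabilities equals the difference of the corresponding inputs, because the normalizing sum cancels. The base R sketch below checks this numerically; `softmax_vec()` is a hypothetical helper defined here for illustration and is not part of the package.

```r
# Hypothetical helper (not part of the package).
softmax_vec <- function(x) exp(x) / sum(exp(x))

x <- c(0.5, 2.0, -1.0)
p <- softmax_vec(x)

# log(p[i] / p[j]) equals x[i] - x[j] because sum(exp(x)) cancels.
log(p[2] / p[1])
x[2] - x[1]

# Adding a constant to every input leaves the probabilities unchanged,
# so the inputs are only determined up to an additive shift.
all.equal(softmax_vec(x + 10), p)
```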

## See also

Other activations: `activation_elu()`, `activation_exponential()`, `activation_gelu()`, `activation_hard_sigmoid()`, `activation_leaky_relu()`, `activation_linear()`, `activation_log_softmax()`, `activation_mish()`, `activation_relu()`, `activation_relu6()`, `activation_selu()`, `activation_sigmoid()`, `activation_silu()`, `activation_softplus()`, `activation_softsign()`, `activation_tanh()`