Based on the available runtime hardware and constraints, this layer chooses between implementations (cuDNN-based or backend-native) to maximize performance. If a GPU is available and all the arguments to the layer meet the requirements of the cuDNN kernel (listed below), the layer uses a fast cuDNN implementation when running on the TensorFlow backend. The requirements for using the cuDNN implementation are:

- `activation` == `tanh`
- `recurrent_activation` == `sigmoid`
- `dropout` == 0 and `recurrent_dropout` == 0
- `unroll` is `FALSE`
- `use_bias` is `TRUE`
- Inputs, if masking is used, are strictly right-padded.
- Eager execution is enabled in the outermost context.
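
To make the requirements above concrete, here is a minimal sketch contrasting a configuration that satisfies the argument requirements with one that does not. The variable names are illustrative, and passing `use_cudnn = FALSE` to opt out explicitly is an assumption about the accepted values of that argument. Whether cuDNN is actually used also depends on the hardware and backend, which this code does not check.

```r
library(keras3)

# Default arguments satisfy the argument requirements: tanh / sigmoid
# activations, no dropout, unroll = FALSE, use_bias = TRUE.
eligible_lstm <- layer_lstm(units = 16)

# recurrent_dropout > 0 breaks one requirement, so the layer is expected
# to use the backend-native implementation instead.
fallback_lstm <- layer_lstm(units = 16, recurrent_dropout = 0.2)

# Assumption: FALSE is accepted by use_cudnn to disable cuDNN explicitly.
native_lstm <- layer_lstm(units = 16, use_cudnn = FALSE)

# All three layers produce outputs of the same shape; only the
# implementation chosen under the hood may differ.
x <- random_uniform(c(32, 10, 8))
out_eligible <- eligible_lstm(x)
out_fallback <- fallback_lstm(x)
out_native <- native_lstm(x)
shape(out_eligible)
```

The outputs are interchangeable in shape, so switching implementations should not require changes to the surrounding model code.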

For example:

```
library(keras3)

# Batch of 32 sequences, 10 timesteps, 8 features each
input <- random_uniform(c(32, 10, 8))
output <- input |> layer_lstm(4)
shape(output)
```

```
lstm <- layer_lstm(units = 4, return_sequences = TRUE, return_state = TRUE)
c(whole_seq_output, final_memory_state, final_carry_state) %<-% lstm(input)
shape(whole_seq_output)
shape(final_memory_state)
shape(final_carry_state)
```

## Usage

```
layer_lstm(
  object,
  units,
  activation = "tanh",
  recurrent_activation = "sigmoid",
  use_bias = TRUE,
  kernel_initializer = "glorot_uniform",
  recurrent_initializer = "orthogonal",
  bias_initializer = "zeros",
  unit_forget_bias = TRUE,
  kernel_regularizer = NULL,
  recurrent_regularizer = NULL,
  bias_regularizer = NULL,
  activity_regularizer = NULL,
  kernel_constraint = NULL,
  recurrent_constraint = NULL,
  bias_constraint = NULL,
  dropout = 0,
  recurrent_dropout = 0,
  seed = NULL,
  return_sequences = FALSE,
  return_state = FALSE,
  go_backwards = FALSE,
  stateful = FALSE,
  unroll = FALSE,
  use_cudnn = "auto",
  ...
)
```

## Arguments

- object
Object to compose the layer with. A tensor, array, or sequential model.

- units
Positive integer, dimensionality of the output space.

- activation
Activation function to use. Default: hyperbolic tangent (`tanh`). If you pass `NULL`, no activation is applied (i.e. "linear" activation: `a(x) = x`).

- recurrent_activation
Activation function to use for the recurrent step. Default: sigmoid (`sigmoid`). If you pass `NULL`, no activation is applied (i.e. "linear" activation: `a(x) = x`).

- use_bias
Boolean (default `TRUE`), whether the layer should use a bias vector.

- kernel_initializer
Initializer for the `kernel` weights matrix, used for the linear transformation of the inputs. Default: `"glorot_uniform"`.

- recurrent_initializer
Initializer for the `recurrent_kernel` weights matrix, used for the linear transformation of the recurrent state. Default: `"orthogonal"`.

- bias_initializer
Initializer for the bias vector. Default: `"zeros"`.

- unit_forget_bias
Boolean (default `TRUE`). If `TRUE`, add 1 to the bias of the forget gate at initialization. Setting it to `TRUE` will also force `bias_initializer = "zeros"`. This is recommended in Jozefowicz et al.

- kernel_regularizer
Regularizer function applied to the `kernel` weights matrix. Default: `NULL`.

- recurrent_regularizer
Regularizer function applied to the `recurrent_kernel` weights matrix. Default: `NULL`.

- bias_regularizer
Regularizer function applied to the bias vector. Default: `NULL`.

- activity_regularizer
Regularizer function applied to the output of the layer (its "activation"). Default: `NULL`.

- kernel_constraint
Constraint function applied to the `kernel` weights matrix. Default: `NULL`.

- recurrent_constraint
Constraint function applied to the `recurrent_kernel` weights matrix. Default: `NULL`.

- bias_constraint
Constraint function applied to the bias vector. Default: `NULL`.

- dropout
Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. Default: 0.

- recurrent_dropout
Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. Default: 0.

- seed
Random seed for dropout.

- return_sequences
Boolean. Whether to return the last output in the output sequence, or the full sequence. Default: `FALSE`.

- return_state
Boolean. Whether to return the last state in addition to the output. Default: `FALSE`.

- go_backwards
Boolean (default: `FALSE`). If `TRUE`, process the input sequence backwards and return the reversed sequence.

- stateful
Boolean (default: `FALSE`). If `TRUE`, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.

- unroll
Boolean (default `FALSE`). If `TRUE`, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed up an RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences.

- use_cudnn
Whether to use a cuDNN-backed implementation. `"auto"` will attempt to use cuDNN when feasible, and will fall back to the default implementation if not.

- ...
For forward/backward compatibility.
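
As an illustration of how several of these arguments interact, the following sketch stacks two LSTM layers. The first layer sets `return_sequences = TRUE` so that the second layer receives a full 3D sequence rather than only the last output. The input dimensions, unit counts, dropout rate, and seed are arbitrary values chosen for the example.

```r
library(keras3)

# Symbolic input: sequences of 10 timesteps with 8 features each
inputs <- keras_input(shape = c(10, 8))

outputs <- inputs |>
  # Return the full sequence so the next LSTM gets a 3D input
  layer_lstm(units = 16, return_sequences = TRUE, dropout = 0.1, seed = 42L) |>
  # Return only the last output (the default)
  layer_lstm(units = 4)

model <- keras_model(inputs, outputs)

# Run a batch of 32 random sequences through the model
pred <- model(random_uniform(c(32, 10, 8)))
shape(pred)
```

Leaving `return_sequences` at its default in the first layer would hand a 2D tensor to the second layer, which expects 3D input.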

## Value

The return value depends on the value provided for the first argument. If `object` is:

- a `keras_model_sequential()`, then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.
- a `keras_input()`, then the output tensor from calling `layer(input)` is returned.
- `NULL` or missing, then a `Layer` instance is returned.
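
The three cases can be sketched as follows. The input shape and unit count are arbitrary, and the variable names are only illustrative.

```r
library(keras3)

# Case 1: a sequential model. The layer is added in place, so the
# result of the pipe does not need to be reassigned.
seq_model <- keras_model_sequential(input_shape = c(10, 8))
seq_model |> layer_lstm(units = 4)

# Case 2: a keras_input(). The symbolic output tensor is returned.
inputs <- keras_input(shape = c(10, 8))
outputs <- inputs |> layer_lstm(units = 4)

# Case 3: `object` is missing. A Layer instance is returned, which
# can be called on tensors later.
lstm <- layer_lstm(units = 4)
result <- lstm(random_uniform(c(32, 10, 8)))
shape(result)
```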

## Call Arguments

- `inputs`: A 3D tensor, with shape `(batch, timesteps, feature)`.
- `mask`: Binary tensor of shape `(samples, timesteps)` indicating whether a given timestep should be masked (optional). An individual `TRUE` entry indicates that the corresponding timestep should be utilized, while a `FALSE` entry indicates that the corresponding timestep should be ignored. Defaults to `NULL`.
- `training`: Boolean indicating whether the layer should behave in training mode or in inference mode. This argument is passed to the cell when calling it. This is only relevant if `dropout` or `recurrent_dropout` is used (optional). Defaults to `NULL`.
- `initial_state`: List of initial state tensors to be passed to the first call of the cell (optional, `NULL` causes creation of zero-filled initial state tensors). Defaults to `NULL`.
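
The call arguments can be passed by name when invoking a layer instance. The sketch below is a hedged example: the tensor sizes are arbitrary, the mask is right-padded to match the cuDNN requirement described earlier, and the ordering of `initial_state` as memory state followed by carry state is assumed from the order in which the states are returned in the example at the top of this page.

```r
library(keras3)

lstm <- layer_lstm(units = 4)

# inputs: (batch = 2, timesteps = 5, feature = 3)
x <- random_uniform(c(2, 5, 3))

# mask: (samples = 2, timesteps = 5). Sample 1 uses all five steps,
# sample 2 uses only the first three (right-padded).
mask <- op_convert_to_tensor(
  rbind(c(TRUE, TRUE, TRUE, TRUE, TRUE),
        c(TRUE, TRUE, TRUE, FALSE, FALSE)),
  dtype = "bool"
)

# initial_state: assumed order is list(memory state, carry state),
# each of shape (batch, units). Zeros match the default behaviour.
initial_state <- list(op_zeros(c(2, 4)), op_zeros(c(2, 4)))

out <- lstm(x, mask = mask, initial_state = initial_state, training = FALSE)
shape(out)
```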

## See also

Other lstm rnn layers: `rnn_cell_lstm()`

Other rnn layers: `layer_bidirectional()`, `layer_conv_lstm_1d()`, `layer_conv_lstm_2d()`, `layer_conv_lstm_3d()`, `layer_gru()`, `layer_rnn()`, `layer_simple_rnn()`, `layer_time_distributed()`, `rnn_cell_gru()`, `rnn_cell_lstm()`, `rnn_cell_simple()`, `rnn_cells_stack()`

Other layers: `Layer()`, `layer_activation()`, `layer_activation_elu()`, `layer_activation_leaky_relu()`, `layer_activation_parametric_relu()`, `layer_activation_relu()`, `layer_activation_softmax()`, `layer_activity_regularization()`, `layer_add()`, `layer_additive_attention()`, `layer_alpha_dropout()`, `layer_attention()`, `layer_average()`, `layer_average_pooling_1d()`, `layer_average_pooling_2d()`, `layer_average_pooling_3d()`, `layer_batch_normalization()`, `layer_bidirectional()`, `layer_category_encoding()`, `layer_center_crop()`, `layer_concatenate()`, `layer_conv_1d()`, `layer_conv_1d_transpose()`, `layer_conv_2d()`, `layer_conv_2d_transpose()`, `layer_conv_3d()`, `layer_conv_3d_transpose()`, `layer_conv_lstm_1d()`, `layer_conv_lstm_2d()`, `layer_conv_lstm_3d()`, `layer_cropping_1d()`, `layer_cropping_2d()`, `layer_cropping_3d()`, `layer_dense()`, `layer_depthwise_conv_1d()`, `layer_depthwise_conv_2d()`, `layer_discretization()`, `layer_dot()`, `layer_dropout()`, `layer_einsum_dense()`, `layer_embedding()`, `layer_feature_space()`, `layer_flatten()`, `layer_flax_module_wrapper()`, `layer_gaussian_dropout()`, `layer_gaussian_noise()`, `layer_global_average_pooling_1d()`, `layer_global_average_pooling_2d()`, `layer_global_average_pooling_3d()`, `layer_global_max_pooling_1d()`, `layer_global_max_pooling_2d()`, `layer_global_max_pooling_3d()`, `layer_group_normalization()`, `layer_group_query_attention()`, `layer_gru()`, `layer_hashed_crossing()`, `layer_hashing()`, `layer_identity()`, `layer_integer_lookup()`, `layer_jax_model_wrapper()`, `layer_lambda()`, `layer_layer_normalization()`, `layer_masking()`, `layer_max_pooling_1d()`, `layer_max_pooling_2d()`, `layer_max_pooling_3d()`, `layer_maximum()`, `layer_mel_spectrogram()`, `layer_minimum()`, `layer_multi_head_attention()`, `layer_multiply()`, `layer_normalization()`, `layer_permute()`, `layer_random_brightness()`, `layer_random_contrast()`, `layer_random_crop()`, `layer_random_flip()`, `layer_random_rotation()`, `layer_random_translation()`, `layer_random_zoom()`, `layer_repeat_vector()`, `layer_rescaling()`, `layer_reshape()`, `layer_resizing()`, `layer_rnn()`, `layer_separable_conv_1d()`, `layer_separable_conv_2d()`, `layer_simple_rnn()`, `layer_spatial_dropout_1d()`, `layer_spatial_dropout_2d()`, `layer_spatial_dropout_3d()`, `layer_spectral_normalization()`, `layer_string_lookup()`, `layer_subtract()`, `layer_text_vectorization()`, `layer_tfsm()`, `layer_time_distributed()`, `layer_torch_module_wrapper()`, `layer_unit_normalization()`, `layer_upsampling_1d()`, `layer_upsampling_2d()`, `layer_upsampling_3d()`, `layer_zero_padding_1d()`, `layer_zero_padding_2d()`, `layer_zero_padding_3d()`, `rnn_cell_gru()`, `rnn_cell_lstm()`, `rnn_cell_simple()`, `rnn_cells_stack()`