Pads sequences to the same length.

This function transforms a list (of length num_samples) of sequences (lists of integers) into a 2D NumPy array of shape (num_samples, num_timesteps). num_timesteps is either the maxlen argument if provided, or the length of the longest sequence in the list.

Sequences that are shorter than num_timesteps are padded with value until they are num_timesteps long.

Sequences longer than num_timesteps are truncated so that they fit the desired length.

The position where padding or truncation happens is determined by the arguments padding and truncating, respectively. Pre-padding or removing values from the beginning of the sequence is the default.

sequence <- list(c(1), c(2, 3), c(4, 5, 6))
pad_sequences(sequence)

##      [,1] [,2] [,3]
## [1,]    0    0    1
## [2,]    0    2    3
## [3,]    4    5    6

pad_sequences(sequence, value=-1)

##      [,1] [,2] [,3]
## [1,]   -1   -1    1
## [2,]   -1    2    3
## [3,]    4    5    6

pad_sequences(sequence, padding='post')

##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    2    3    0
## [3,]    4    5    6

pad_sequences(sequence, maxlen=2)

##      [,1] [,2]
## [1,]    0    1
## [2,]    2    3
## [3,]    5    6

Usage

pad_sequences(
  sequences,
  maxlen = NULL,
  dtype = "int32",
  padding = "pre",
  truncating = "pre",
  value = 0
)

Arguments

sequences: List of sequences (each sequence is a list of integers).
maxlen: Optional Int, maximum length of all sequences. If not provided, sequences will be padded to the length of the longest individual sequence.
dtype: (Optional, defaults to "int32"). Type of the output sequences. To pad sequences with variable length strings, you can use object.
padding: String, "pre" or "post" (optional, defaults to "pre"): pad either before or after each sequence.
truncating: String, "pre" or "post" (optional, defaults to "pre"): remove values from sequences larger than maxlen, either at the beginning or at the end of the sequences.
value: Float or String, padding value. (Optional, defaults to 0)

Value

Array with shape (len(sequences), maxlen)

Usage

Arguments

Value

See also