Max pooling operation for 2D spatial data.

Downsamples the input along its spatial dimensions (height and width) by taking the maximum value over an input window (of size defined by pool_size) for each channel of the input. The window is shifted by strides along each dimension.

The resulting output when using the "valid" padding option has a spatial shape (number of rows or columns) of: output_shape = floor((input_shape - pool_size) / strides) + 1 (when input_shape >= pool_size)

The resulting output shape when using the "same" padding option is: output_shape = floor((input_shape - 1) / strides) + 1

Usage

layer_max_pooling_2d(
  object,
  pool_size = list(2L, 2L),
  strides = NULL,
  padding = "valid",
  data_format = NULL,
  name = NULL,
  ...
)

Arguments

object: Object to compose the layer with. A tensor, array, or sequential model.
pool_size: int or list of 2 integers, factors by which to downscale (dim1, dim2). If only one integer is specified, the same window length will be used for all dimensions.
strides: int or list of 2 integers, or NULL. Strides values. If NULL, it will default to pool_size. If only one int is specified, the same stride size will be used for all dimensions.
padding: string, either "valid" or "same" (case-insensitive). "valid" means no padding. "same" results in padding evenly to the left/right or up/down of the input such that output has the same height/width dimension as the input.
data_format: string, either "channels_last" or "channels_first". The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch, height, width, channels) while "channels_first" corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".
name: String, name for the object
...: For forward/backward compatability.

Value

The return value depends on the value provided for the first argument. If object is:

a keras_model_sequential(), then the layer is added to the sequential model (which is modified in place). To enable piping, the sequential model is also returned, invisibly.
a keras_input(), then the output tensor from calling layer(input) is returned.
NULL or missing, then a Layer instance is returned.

Input Shape

If data_format="channels_last": 4D tensor with shape (batch_size, height, width, channels).
If data_format="channels_first": 4D tensor with shape (batch_size, channels, height, width).

Output Shape

If data_format="channels_last": 4D tensor with shape (batch_size, pooled_height, pooled_width, channels).
If data_format="channels_first": 4D tensor with shape (batch_size, channels, pooled_height, pooled_width).

Examples

strides = (1, 1) and padding = "valid":

x <- rbind(c(1., 2., 3.),
           c(4., 5., 6.),
           c(7., 8., 9.)) |> op_reshape(c(1, 3, 3, 1))
max_pool_2d <- layer_max_pooling_2d(pool_size = c(2, 2),
                                    strides = c(1, 1),
                                    padding = "valid")
max_pool_2d(x)

## tf.Tensor(
## [[[[5.]
##    [6.]]
##
##   [[8.]
##    [9.]]]], shape=(1, 2, 2, 1), dtype=float32)

strides = c(2, 2) and padding = "valid":

x <- rbind(c(1., 2., 3., 4.),
           c(5., 6., 7., 8.),
           c(9., 10., 11., 12.)) |> op_reshape(c(1, 3, 4, 1))
max_pool_2d <- layer_max_pooling_2d(pool_size = c(2, 2),
                                    strides = c(2, 2),
                                    padding = "valid")
max_pool_2d(x)

## tf.Tensor(
## [[[[6.]
##    [8.]]]], shape=(1, 1, 2, 1), dtype=float32)

stride = (1, 1) and padding = "same":

x <- rbind(c(1., 2., 3.),
           c(4., 5., 6.),
           c(7., 8., 9.)) |> op_reshape(c(1, 3, 3, 1))
max_pool_2d <- layer_max_pooling_2d(pool_size = c(2, 2),
                                    strides = c(1, 1),
                                    padding = "same")
max_pool_2d(x)

## tf.Tensor(
## [[[[5.]
##    [6.]
##    [6.]]
##
##   [[8.]
##    [9.]
##    [9.]]
##
##   [[8.]
##    [9.]
##    [9.]]]], shape=(1, 3, 3, 1), dtype=float32)

Other pooling layers:
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_max_pooling_1d()
layer_max_pooling_3d()

Other layers:
Layer()
layer_activation()
layer_activation_elu()
layer_activation_leaky_relu()
layer_activation_parametric_relu()
layer_activation_relu()
layer_activation_softmax()
layer_activity_regularization()
layer_add()
layer_additive_attention()
layer_alpha_dropout()
layer_attention()
layer_aug_mix()
layer_auto_contrast()
layer_average()
layer_average_pooling_1d()
layer_average_pooling_2d()
layer_average_pooling_3d()
layer_batch_normalization()
layer_bidirectional()
layer_category_encoding()
layer_center_crop()
layer_concatenate()
layer_conv_1d()
layer_conv_1d_transpose()
layer_conv_2d()
layer_conv_2d_transpose()
layer_conv_3d()
layer_conv_3d_transpose()
layer_conv_lstm_1d()
layer_conv_lstm_2d()
layer_conv_lstm_3d()
layer_cropping_1d()
layer_cropping_2d()
layer_cropping_3d()
layer_cut_mix()
layer_dense()
layer_depthwise_conv_1d()
layer_depthwise_conv_2d()
layer_discretization()
layer_dot()
layer_dropout()
layer_einsum_dense()
layer_embedding()
layer_equalization()
layer_feature_space()
layer_flatten()
layer_flax_module_wrapper()
layer_gaussian_dropout()
layer_gaussian_noise()
layer_global_average_pooling_1d()
layer_global_average_pooling_2d()
layer_global_average_pooling_3d()
layer_global_max_pooling_1d()
layer_global_max_pooling_2d()
layer_global_max_pooling_3d()
layer_group_normalization()
layer_group_query_attention()
layer_gru()
layer_hashed_crossing()
layer_hashing()
layer_identity()
layer_integer_lookup()
layer_jax_model_wrapper()
layer_lambda()
layer_layer_normalization()
layer_lstm()
layer_masking()
layer_max_num_bounding_boxes()
layer_max_pooling_1d()
layer_max_pooling_3d()
layer_maximum()
layer_mel_spectrogram()
layer_minimum()
layer_mix_up()
layer_multi_head_attention()
layer_multiply()
layer_normalization()
layer_permute()
layer_rand_augment()
layer_random_brightness()
layer_random_color_degeneration()
layer_random_color_jitter()
layer_random_contrast()
layer_random_crop()
layer_random_elastic_transform()
layer_random_erasing()
layer_random_flip()
layer_random_gaussian_blur()
layer_random_grayscale()
layer_random_hue()
layer_random_invert()
layer_random_perspective()
layer_random_posterization()
layer_random_rotation()
layer_random_saturation()
layer_random_sharpness()
layer_random_shear()
layer_random_translation()
layer_random_zoom()
layer_repeat_vector()
layer_rescaling()
layer_reshape()
layer_resizing()
layer_rms_normalization()
layer_rnn()
layer_separable_conv_1d()
layer_separable_conv_2d()
layer_simple_rnn()
layer_solarization()
layer_spatial_dropout_1d()
layer_spatial_dropout_2d()
layer_spatial_dropout_3d()
layer_spectral_normalization()
layer_stft_spectrogram()
layer_string_lookup()
layer_subtract()
layer_text_vectorization()
layer_tfsm()
layer_time_distributed()
layer_torch_module_wrapper()
layer_unit_normalization()
layer_upsampling_1d()
layer_upsampling_2d()
layer_upsampling_3d()
layer_zero_padding_1d()
layer_zero_padding_2d()
layer_zero_padding_3d()
rnn_cell_gru()
rnn_cell_lstm()
rnn_cell_simple()
rnn_cells_stack()