keras.layers.Conv2D() function parameters

Posted by jim35802 on Mon, 10 Jan 2022 22:00:36 +0100

tf.keras.layers.Conv2D() function

Conv2D (2D convolution)

This layer creates a convolution kernel that is convolved with the layer's input to produce an output tensor.

When this layer is used as the first layer of a model, provide the keyword argument input_shape (a tuple of integers that excludes the sample axis, so batch_size is not written).
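
As a minimal sketch of the point above, assuming TensorFlow 2.x with its bundled Keras: the input_shape tuple lists only height, width and channels, and Keras reports the batch axis as None. The layer sizes are arbitrary, and newer Keras releases may warn that an explicit Input object is preferred.

```python
import tensorflow as tf
from tensorflow.keras.layers import Conv2D

# input_shape omits the batch axis: (height, width, channels).
model = tf.keras.Sequential([
    Conv2D(64, (1, 1), input_shape=(600, 600, 3)),
])

# The batch axis shows up as None in the model's output shape.
output_shape = tuple(model.output_shape)
print(output_shape)
```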

def __init__(self, filters,
             kernel_size,
             strides=(1, 1),
             padding='valid',
             data_format=None,
             dilation_rate=(1, 1),
             activation=None,
             use_bias=True,
             kernel_initializer='glorot_uniform',
             bias_initializer='zeros',
             kernel_regularizer=None,
             bias_regularizer=None,
             activity_regularizer=None,
             kernel_constraint=None,
             bias_constraint=None,
             **kwargs):

Parameters

filters

int. The number of convolution kernels (output filters). With the default channels_last layout, filters sets the size of the fourth (last) dimension of the output.

import tensorflow as tf
from tensorflow.keras.layers import Conv2D

input_shape = (4, 600, 600, 3)
input = tf.random.normal(input_shape)
x = Conv2D(64, (1, 1), strides=(1, 1), name='conv1')(input)
print(x.shape)

OUTPUT:
(4, 600, 600, 64)

kernel_size

The size of the convolution kernel, given as (height, width). If the kernel is square, a single integer is enough. It affects the two middle (spatial) dimensions of the output.

x = Conv2D(64, (2, 2), strides=(1, 1), name='conv1')(input)
#or Conv2D(64, 2, strides=(1, 1), name='conv1')(input)
print(x.shape)

OUTPUT:
(4, 599, 599, 64)

strides

int or tuple (int, int). The stride (step size) also affects the two middle dimensions of the output. Note that the two values in the tuple may differ: they control the step along the height and the width separately.

x = Conv2D(64, 1, strides=(2, 2), name='conv1')(input)
print(x.shape)

OUTPUT:
(4, 300, 300, 64)
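
To illustrate the remark about unequal strides, here is a small sketch, assuming TensorFlow 2.x. The input size and filter count are arbitrary choices for the example, not values from the post above.

```python
import tensorflow as tf
from tensorflow.keras.layers import Conv2D

# Small random batch: 4 samples, 60x60 pixels, 3 channels (channels_last).
images = tf.random.normal((4, 60, 60, 3))

# Step 2 along the height and step 3 along the width.
y = Conv2D(8, 1, strides=(2, 3))(images)
strided_shape = tuple(y.shape)
print(strided_shape)  # (4, 30, 20, 8)
```

The height is halved while the width is divided by three, because each stride value applies to its own axis.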

padding

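Whether to pad the borders of the input. With "same", zeros are padded around the input so that the spatial size is kept even though kernel_size would otherwise shrink it (with a stride of 1; with a larger stride the output size is the input size divided by the stride, rounded up). With "valid" (the default), no padding is added and the kernel is applied only where it fits entirely inside the input.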

a = Conv2D(64, 1, strides=(2, 2), padding="same" , name='conv1')(input)
b = Conv2D(64, 3, strides=(2, 2), padding="same" , name='conv1')(input)
c = Conv2D(64, 3, strides=(1, 1), padding="same" , name='conv1')(input)
d = Conv2D(64, 3, strides=(1, 1), padding="valid", name='conv1')(input)
for t in (a, b, c, d):
    print(t.shape)

OUTPUT:
(4, 300, 300, 64)
(4, 300, 300, 64)
(4, 600, 600, 64)
(4, 598, 598, 64)
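
The shapes in the examples so far follow a simple rule per spatial axis. The sketch below is a plain-Python illustration of that rule, not TensorFlow's own code; conv_output_length is a hypothetical helper name introduced here.

```python
import math

def conv_output_length(input_length, kernel_size, stride=1,
                       padding="valid", dilation=1):
    """Output size along one spatial axis (hypothetical helper)."""
    if padding == "same":
        # Zero padding makes the result depend only on the stride.
        return math.ceil(input_length / stride)
    if padding != "valid":
        raise ValueError("padding must be 'valid' or 'same'")
    # A dilated kernel spans kernel_size + (kernel_size - 1) * (dilation - 1) inputs.
    effective_kernel = kernel_size + (kernel_size - 1) * (dilation - 1)
    return (input_length - effective_kernel) // stride + 1

print(conv_output_length(600, 2))              # 599
print(conv_output_length(600, 1, stride=2))    # 300
print(conv_output_length(600, 3, 2, "same"))   # 300
print(conv_output_length(600, 3, 1, "valid"))  # 598
```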

activation

Activation function. If activation is not None, it is applied to the output; with the default None, no activation is applied.
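
A short sketch of the effect, assuming TensorFlow 2.x: with activation='relu' the layer output contains no negative values. The input size and filter count are arbitrary.

```python
import tensorflow as tf
from tensorflow.keras.layers import Conv2D

images = tf.random.normal((4, 8, 8, 3))

# ReLU is applied to the convolution result before it is returned.
y = Conv2D(16, 3, activation='relu')(images)

relu_shape = tuple(y.shape)
min_value = float(tf.reduce_min(y))
print(relu_shape, min_value >= 0.0)  # (4, 6, 6, 16) True
```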


use_bias

boolean, indicating whether the layer uses a bias. If use_bias is True, a bias vector is created and added to the output.
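
One way to see the bias vector is to compare parameter counts. This is a minimal sketch assuming TensorFlow 2.x; the sizes are arbitrary.

```python
import tensorflow as tf
from tensorflow.keras.layers import Conv2D

images = tf.random.normal((4, 8, 8, 3))

with_bias = Conv2D(16, 3, use_bias=True)
without_bias = Conv2D(16, 3, use_bias=False)

# Calling a layer once builds its weights.
with_bias(images)
without_bias(images)

# Kernel: 3 * 3 * 3 input channels * 16 filters = 432 values; bias: 16 values.
params_with_bias = with_bias.count_params()
params_without_bias = without_bias.count_params()
print(params_with_bias, params_without_bias)  # 448 432
```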


data_format

Specifies the layout of the input (and therefore of input_shape).

If it is not set, it defaults to channels_last (more precisely, to the image_data_format value in the Keras configuration, which is channels_last unless changed); alternatively it can be set to channels_first. The former interprets the input as (batch_size, height, width, channels), the latter as (batch_size, channels, height, width). As before, the sample axis (batch_size) is not included when you write input_shape.
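
If the data is stored channels-first but the layer uses the default layout, one option is to reorder the axes before the convolution. This is only a sketch, assuming TensorFlow 2.x; the batch below is made-up random data.

```python
import tensorflow as tf
from tensorflow.keras.layers import Conv2D

# Hypothetical batch stored as (batch_size, channels, height, width).
channels_first_batch = tf.random.normal((4, 3, 32, 32))

# Reorder the axes to (batch_size, height, width, channels).
channels_last_batch = tf.transpose(channels_first_batch, perm=[0, 2, 3, 1])

y = Conv2D(8, 3, data_format="channels_last")(channels_last_batch)
converted_shape = tuple(channels_last_batch.shape)
result_shape = tuple(y.shape)
print(converted_shape, result_shape)  # (4, 32, 32, 3) (4, 30, 30, 8)
```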


dilation_rate

int, tuple(int, int), or list[int, int]. Specifies the dilation rate to use for dilated convolution. A single integer applies the same value to all spatial dimensions. This parameter defines the spacing between the input values that the kernel samples.

At the same computational cost, a dilation rate greater than 1 gives a larger receptive field. It is often used in real-time image segmentation. Consider it when a layer needs a large receptive field but computational resources are limited, so the number or size of convolution kernels cannot be increased.
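
The trade-off described above can be checked with a small sketch, assuming TensorFlow 2.x: a 3x3 kernel with dilation_rate=2 covers a 5x5 window but keeps the same number of weights. The sizes are arbitrary.

```python
import tensorflow as tf
from tensorflow.keras.layers import Conv2D

images = tf.random.normal((4, 32, 32, 3))

plain = Conv2D(8, 3, dilation_rate=1)
dilated = Conv2D(8, 3, dilation_rate=2)

plain_shape = tuple(plain(images).shape)
dilated_shape = tuple(dilated(images).shape)

# A 3x3 kernel with dilation 2 covers a 5x5 window, so the output shrinks more.
print(plain_shape, dilated_shape)  # (4, 30, 30, 8) (4, 28, 28, 8)

# The number of weights is the same in both layers.
plain_params = plain.count_params()
dilated_params = dilated.count_params()
print(plain_params, dilated_params)  # 224 224
```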


Return value

Returns a four-dimensional tensor.

The first number is the batch size, that is, how many samples there are; the last three numbers describe the size of each sample's output. With channels_last the full shape is (batch_size, new_height, new_width, filters).






Topics: Python Machine Learning TensorFlow Deep Learning