# nn.AvgPool2d -- two dimensional average pooling operation

Posted by jsnyder2k on Tue, 28 Sep 2021 20:52:24 +0200

# PyTorch learning notes: nn.AvgPool2d - two dimensional average pooling operation

torch.nn.AvgPool2d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None)


Function: applies 2D average pooling to an input signal composed of several planes. Assume the input size is (N, C, H, W), the output size is (N, C, H_out, W_out), and the pooling kernel size is (kH, kW). Each output element is then computed as:

$$out(N_i, C_j, h, w) = \frac{1}{kH * kW} \sum_{m=0}^{kH-1} \sum_{n=0}^{kW-1} input(N_i, C_j, stride[0] \times h + m, stride[1] \times w + n)$$
If padding is non-zero, the input is implicitly padded with zeros on all four sides. The count_include_pad parameter determines whether these zeros are included in the average.
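To make the formula concrete, here is a minimal sketch that evaluates it with plain loops and compares the result against nn.AvgPool2d. The helper name avg_pool2d_naive is made up for this illustration, and it only covers the simplest case (no padding, default rounding down).

```python
import torch
from torch import nn


def avg_pool2d_naive(x, kernel_size, stride):
    """Loop version of the pooling formula (hypothetical helper; no padding, floor mode)."""
    kH, kW = kernel_size
    sH, sW = stride
    N, C, H, W = x.shape
    H_out = (H - kH) // sH + 1
    W_out = (W - kW) // sW + 1
    out = torch.empty(N, C, H_out, W_out, dtype=x.dtype)
    for h in range(H_out):
        for w in range(W_out):
            # Window that starts at (stride[0] * h, stride[1] * w)
            window = x[:, :, sH * h:sH * h + kH, sW * w:sW * w + kW]
            # Sum over the window, then divide by kH * kW
            out[:, :, h, w] = window.sum(dim=(2, 3)) / (kH * kW)
    return out


x = torch.arange(2 * 3 * 5 * 7, dtype=torch.float).reshape(2, 3, 5, 7)
reference = nn.AvgPool2d((2, 3), stride=(2, 1))(x)
manual = avg_pool2d_naive(x, (2, 3), (2, 1))
print(reference.shape)
print(torch.allclose(reference, manual))
```

The loop version is far slower than the built-in layer and is only meant for checking one's understanding of the indices.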

Parameters:

• kernel_size: the size of the pooling kernel (window)
• stride: the step with which the window moves; by default it is equal to kernel_size
• padding: the amount of implicit zero padding added on each side; 0 by default
• ceil_mode: when set to True, the output shape is computed by rounding up; otherwise it is rounded down
• count_include_pad: Boolean. When True, the zero padding is included in the average calculation; otherwise it is not
• divisor_override: if specified, it is used as the divisor in place of the kernel size. In other words, when this parameter is not specified, average pooling sums the elements inside the pooling window and divides by the window size, that is, by kernel height × kernel width. When it is specified, the sum of the elements inside the window is divided by divisor_override instead.

Note:

• The parameters kernel_size, stride and padding can each be:

• Integer, in which case the height and width dimensions are the same
• Tuple, containing two integers, the first for the height dimension and the second for the width dimension
• The output shape is calculated as:

$$H_{out} = \left\lfloor \frac{H_{in} + 2 \times padding[0] - kernel\_size[0]}{stride[0]} + 1 \right\rfloor$$

$$W_{out} = \left\lfloor \frac{W_{in} + 2 \times padding[1] - kernel\_size[1]}{stride[1]} + 1 \right\rfloor$$

where H_in and W_in are the height and width of the input. The result is rounded down by default; setting ceil_mode=True rounds up instead.

• The padding must be at most half of the pooling kernel size; otherwise PyTorch raises an error
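The output shape rule can be checked with a small helper. This is a sketch, not part of PyTorch: pooled_size is a hypothetical name, and the extra check in the ceil_mode branch reflects my reading of the PyTorch documentation (a window may run past the edge only if it starts inside the input or the left/top padding).

```python
import math

import torch
from torch import nn


def pooled_size(size, kernel, stride, padding=0, ceil_mode=False):
    """Output length along one dimension (hypothetical helper, not a PyTorch API)."""
    span = size + 2 * padding - kernel
    if ceil_mode:
        out = math.ceil(span / stride) + 1
        # Assumption: drop a last window that would start beyond the
        # input plus the left/top padding.
        if (out - 1) * stride >= size + padding:
            out -= 1
        return out
    return span // stride + 1


x = torch.zeros(1, 1, 4, 5)
for ceil_mode in (False, True):
    pool = nn.AvgPool2d(2, stride=2, ceil_mode=ceil_mode)
    predicted = (pooled_size(4, 2, 2, ceil_mode=ceil_mode),
                 pooled_size(5, 2, 2, ceil_mode=ceil_mode))
    actual = tuple(pool(x).shape[-2:])
    print(ceil_mode, predicted, actual)
```

For the 4×5 input used here, the predicted and actual shapes should agree in both modes: the width 5 gives 2 columns when rounding down and 3 when rounding up.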

# Code examples

General usage

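import torch
from torch import nn
# Use a float dtype: integer tensors are not supported by average pooling
# in every PyTorch version and backend, and would truncate the averages
img = torch.arange(16, dtype=torch.float).reshape(1, 1, 4, 4)
# The pooling kernel size and the stride are both 2
pool = nn.AvgPool2d(2, stride=2)
img_2 = pool(img)
print(img)
print(img_2)


output

# Original image
tensor([[[[ 0.,  1.,  2.,  3.],
          [ 4.,  5.,  6.,  7.],
          [ 8.,  9., 10., 11.],
          [12., 13., 14., 15.]]]])
# The height and width of the pooled image are half of the original
tensor([[[[ 2.5000,  4.5000],
          [10.5000, 12.5000]]]])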


The difference between setting ceil_mode to True and False

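import torch
from torch import nn
img = torch.arange(20, dtype=torch.float).reshape(1, 1, 4, 5)
# ceil_mode=False (the default): the output shape is rounded down
pool_f = nn.AvgPool2d(2, stride=2, ceil_mode=False)
# ceil_mode=True: the output shape is rounded up
pool_t = nn.AvgPool2d(2, stride=2, ceil_mode=True)
img_2 = pool_f(img)
img_3 = pool_t(img)
print(img)
print(img_2)
print(img_3)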


output

# Original image
tensor([[[[ 0.,  1.,  2.,  3.,  4.],
          [ 5.,  6.,  7.,  8.,  9.],
          [10., 11., 12., 13., 14.],
          [15., 16., 17., 18., 19.]]]])
# By default, ceil_mode is False
tensor([[[[ 3.,  5.],
          [13., 15.]]]])
# ceil_mode is True
tensor([[[[ 3.0000,  5.0000,  6.5000],
          [13.0000, 15.0000, 16.5000]]]])
# The width 5 is not a multiple of the stride 2, so rounding down gives 2 output columns
# and rounding up gives 3; the last column averages only the elements inside the input


The difference between setting and not setting padding

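import torch
from torch import nn
img = torch.arange(16, dtype=torch.float).reshape(1, 1, 4, 4)
# With padding: one row/column of zeros is added on each side
pool_t = nn.AvgPool2d(2, stride=2, padding=1)
# Without padding
pool_f = nn.AvgPool2d(2, stride=2)
img_2 = pool_t(img)
img_3 = pool_f(img)
print(img)
print(img_2)
print(img_3)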


output

# Original image
tensor([[[[ 0.,  1.,  2.,  3.],
          [ 4.,  5.,  6.,  7.],
          [ 8.,  9., 10., 11.],
          [12., 13., 14., 15.]]]])
# The padding width is 1; by default the padded zeros are included in the average
tensor([[[[0.0000, 0.7500, 0.7500],
          [3.0000, 7.5000, 4.5000],
          [3.0000, 6.7500, 3.7500]]]])
# Result without padding
tensor([[[[ 2.5000,  4.5000],
          [10.5000, 12.5000]]]])
# The size of the padded result follows from the output shape formula: floor((4 + 2*1 - 2) / 2 + 1) = 3


The difference between setting count_include_pad to True and False

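import torch
from torch import nn
img = torch.arange(16, dtype=torch.float).reshape(1, 1, 4, 4)
# count_include_pad=True (the default): padded zeros count toward the average
pool_t = nn.AvgPool2d(2, stride=2, padding=1, count_include_pad=True)
# count_include_pad=False: only real input elements count toward the average
pool_f = nn.AvgPool2d(2, stride=2, padding=1, count_include_pad=False)
img_2 = pool_t(img)
img_3 = pool_f(img)
print(img)
print(img_2)
print(img_3)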


output

# Original image
tensor([[[[ 0.,  1.,  2.,  3.],
          [ 4.,  5.,  6.,  7.],
          [ 8.,  9., 10., 11.],
          [12., 13., 14., 15.]]]])
# The padding width is 1; count_include_pad is True by default
# The padded zeros are included in the average
tensor([[[[0.0000, 0.7500, 0.7500],
          [3.0000, 7.5000, 4.5000],
          [3.0000, 6.7500, 3.7500]]]])
# The padded zeros are not included in the average
tensor([[[[ 0.0000,  1.5000,  3.0000],
          [ 6.0000,  7.5000,  9.0000],
          [12.0000, 13.5000, 15.0000]]]])
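One way to see what count_include_pad=False does is to rebuild it by hand. The sketch below is only an illustration: it uses divisor_override=1 to get plain window sums, pools a tensor of ones to count the real (non-padding) elements in each window, and divides the two. The variable names are my own.

```python
import torch
from torch import nn

img = torch.arange(16, dtype=torch.float).reshape(1, 1, 4, 4)

# divisor_override=1 leaves each window sum undivided (sum pooling)
sum_pool = nn.AvgPool2d(2, stride=2, padding=1, divisor_override=1)

window_sums = sum_pool(img)
# Pooling ones counts how many real input elements fall in each window
valid_counts = sum_pool(torch.ones_like(img))

manual = window_sums / valid_counts
builtin = nn.AvgPool2d(2, stride=2, padding=1, count_include_pad=False)(img)

print(valid_counts)
print(torch.allclose(manual, builtin))
```

Corner windows contain a single real element, edge windows two, and the center window four, which is why the corner values in the count_include_pad=False output equal the original corner pixels.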


The difference between setting and not setting divisor_override

import torch
from torch import nn
img=torch.arange(16,dtype=torch.float).reshape(1,1,4,4)
pool_1=nn.AvgPool2d(2,stride=2)
pool_d1=nn.AvgPool2d(2,stride=2,divisor_override=2)
pool_d2=nn.AvgPool2d(2,stride=2,divisor_override=3)
img_1=pool_1(img)
img_2=pool_d1(img)
img_3=pool_d2(img)
print(img)
print(img_1)
print(img_2)
print(img_3)


output

# Original image
tensor([[[[ 0.,  1.,  2.,  3.],
          [ 4.,  5.,  6.,  7.],
          [ 8.,  9., 10., 11.],
          [12., 13., 14., 15.]]]])
# divisor_override is not set, which is the normal average pooling operation
tensor([[[[ 2.5000,  4.5000],
          [10.5000, 12.5000]]]])
# divisor_override is set to 2. Take the four elements in the upper left corner as an example:
# after pooling, the first element is their sum (0 + 1 + 4 + 5 = 10) divided by 2
tensor([[[[ 5.,  9.],
          [21., 25.]]]])
# divisor_override is set to 3,
# that is, the sum of the four elements divided by 3
tensor([[[[ 3.3333,  6.0000],
          [14.0000, 16.6667]]]])
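The results above suggest a simple relation: without padding, pooling with divisor_override=d equals ordinary average pooling scaled by (kernel height × kernel width) / d. The following sketch checks this for the 2×2 kernel used here; kernel_area is just a local variable for the illustration.

```python
import torch
from torch import nn

img = torch.arange(16, dtype=torch.float).reshape(1, 1, 4, 4)
kernel_area = 2 * 2  # kernel height x kernel width

avg = nn.AvgPool2d(2, stride=2)(img)
results = {}
for d in (1, 2, 3):
    overridden = nn.AvgPool2d(2, stride=2, divisor_override=d)(img)
    # Same window sums, only the divisor differs
    results[d] = torch.allclose(overridden, avg * kernel_area / d)
    print(d, results[d])
```

With d=1 this turns the layer into sum pooling, which can be handy when the total over each window is needed rather than the mean.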


# Official documents
