PyTorch learning notes: nn.AvgPool2d - two-dimensional average pooling
torch.nn.AvgPool2d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None)
Function: applies 2D average pooling over an input signal composed of several input planes. The calculation is:
$$
out(N_i, C_j, h, w) = \frac{1}{kH \times kW} \sum_{m=0}^{kH-1} \sum_{n=0}^{kW-1} input(N_i, C_j, stride[0] \times h + m, stride[1] \times w + n)
$$
Here the input size is $(N, C, H, W)$, the output size is $(N, C, H_{out}, W_{out})$, and the pooling kernel size is $(kH, kW)$.
If padding is non-zero, the input is implicitly zero-padded on both sides of each spatial dimension. The count_include_pad parameter determines whether these zeros are included in the average.
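The pooling formula can be checked against a direct loop. The following is a minimal sketch, assuming no padding and the default floor rounding; the helper name `avg_pool2d_naive` is hypothetical and is not part of PyTorch.

```python
import torch
from torch import nn


def avg_pool2d_naive(x, kernel_size, stride):
    """Hypothetical helper: loop form of the formula (no padding, floor rounding)."""
    kH, kW = kernel_size
    sH, sW = stride
    N, C, H, W = x.shape
    H_out = (H - kH) // sH + 1
    W_out = (W - kW) // sW + 1
    out = torch.zeros(N, C, H_out, W_out, dtype=x.dtype)
    for h in range(H_out):
        for w in range(W_out):
            # Window whose top-left corner is (stride[0] * h, stride[1] * w)
            window = x[:, :, sH * h:sH * h + kH, sW * w:sW * w + kW]
            # Sum over m in [0, kH) and n in [0, kW), then divide by kH * kW
            out[:, :, h, w] = window.sum(dim=(2, 3)) / (kH * kW)
    return out


x = torch.arange(2 * 3 * 5 * 7, dtype=torch.float).reshape(2, 3, 5, 7)
expected = nn.AvgPool2d(kernel_size=(2, 3), stride=(2, 1))(x)
result = avg_pool2d_naive(x, kernel_size=(2, 3), stride=(2, 1))
print(result.shape)
print(torch.allclose(result, expected))
```

The loop is far slower than the built-in layer; it is only meant to make the indices in the formula concrete.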
Parameters:
- kernel_size: the size of the pooling window (kernel)
- stride: the step by which the window moves; defaults to kernel_size
- padding: the width of the implicit zero padding added on both sides of each spatial dimension
- ceil_mode: when True, the output shape is computed by rounding up; otherwise it is computed by rounding down
- count_include_pad: Boolean. When True, the zero padding is included in the average; otherwise it is excluded
- divisor_override: if specified, it replaces the divisor. By default, the elements in each pooling window are summed and divided by the window size (kernel height × kernel width); when divisor_override is set, the sum is divided by divisor_override instead.
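As a small check of the stride default described in the list above, the sketch below builds one layer with stride left unset and one with stride set explicitly to the kernel size. The 6×6 input and the variable names are chosen only for illustration.

```python
import torch
from torch import nn

x = torch.zeros(1, 1, 6, 6)

pool_default = nn.AvgPool2d(3)             # stride not given
pool_explicit = nn.AvgPool2d(3, stride=3)  # stride equal to kernel_size
pool_overlap = nn.AvgPool2d(3, stride=1)   # overlapping windows

print(pool_default.stride)        # the layer stores kernel_size as its stride
print(pool_default(x).shape)      # non-overlapping 3x3 windows
print(pool_explicit(x).shape)
print(pool_overlap(x).shape)      # a smaller stride gives a larger output
```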
Notes:
- The parameters kernel_size, stride and padding can each be:
  - an integer, in which case the same value is used for the height and width dimensions
  - a tuple of two integers, the first for the height dimension and the second for the width dimension
- The output shape is computed as:
  $$
  H_{out} = \left\lfloor \frac{H_{in} + 2 \times padding[0] - kernel\_size[0]}{stride[0]} + 1 \right\rfloor
  $$
  $$
  W_{out} = \left\lfloor \frac{W_{in} + 2 \times padding[1] - kernel\_size[1]}{stride[1]} + 1 \right\rfloor
  $$
  where $H_{in}$ and $W_{in}$ are the height and width of the input. The result is rounded down by default; setting ceil_mode=True rounds up instead.
- The padding must be at most half the kernel size.
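The notes above can be sketched in code. This example assumes the floor form of the output shape formula, floor((in + 2 × padding − kernel_size) / stride + 1), as given in the official documentation; the helper `pooled_size` is a hypothetical name introduced here. It also assumes that oversized padding is reported as a RuntimeError when the layer is applied, which is the behaviour I would expect from current PyTorch releases.

```python
import math
import torch
from torch import nn


def pooled_size(size_in, kernel, stride, padding):
    """Hypothetical helper: output length along one dimension (ceil_mode=False)."""
    return math.floor((size_in + 2 * padding - kernel) / stride + 1)


x = torch.zeros(1, 1, 7, 10)

# Tuple arguments: the first entry applies to the height, the second to the width.
pool = nn.AvgPool2d(kernel_size=(3, 2), stride=(2, 1), padding=(1, 0))
h_out = pooled_size(7, kernel=3, stride=2, padding=1)
w_out = pooled_size(10, kernel=2, stride=1, padding=0)
print(pool(x).shape, (h_out, w_out))

# Padding larger than half the kernel size is rejected when the layer is applied.
try:
    nn.AvgPool2d(kernel_size=2, padding=2)(x)
    rejected = False
except RuntimeError:
    rejected = True
print(rejected)
```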
Code examples
General usage
import torch
from torch import nn

# Use a float dtype: recent PyTorch versions do not implement
# average pooling for integer (Long) tensors
img = torch.arange(16, dtype=torch.float).reshape(1, 1, 4, 4)

# Both the pooling kernel size and the stride are 2
pool = nn.AvgPool2d(2, stride=2)
img_2 = pool(img)

print(img)
print(img_2)
output
# Original image
tensor([[[[ 0.,  1.,  2.,  3.],
          [ 4.,  5.,  6.,  7.],
          [ 8.,  9., 10., 11.],
          [12., 13., 14., 15.]]]])
# The height and width of the pooled image are half of the original
tensor([[[[ 2.5000,  4.5000],
          [10.5000, 12.5000]]]])
The difference between setting ceil_mode to True and False
import torch
from torch import nn

img = torch.arange(20, dtype=torch.float).reshape(1, 1, 4, 5)

pool_f = nn.AvgPool2d(2, stride=2, padding=0, ceil_mode=False)
pool_t = nn.AvgPool2d(2, stride=2, padding=0, ceil_mode=True)
img_2 = pool_f(img)
img_3 = pool_t(img)

print(img)
print(img_2)
print(img_3)
output
# Original image
tensor([[[[ 0.,  1.,  2.,  3.,  4.],
          [ 5.,  6.,  7.,  8.,  9.],
          [10., 11., 12., 13., 14.],
          [15., 16., 17., 18., 19.]]]])
# ceil_mode is False (the default)
tensor([[[[ 3.,  5.],
          [13., 15.]]]])
# ceil_mode is True
tensor([[[[ 3.0000,  5.0000,  6.5000],
          [13.0000, 15.0000, 16.5000]]]])
# The width 5 is not a multiple of the stride 2: rounding down gives 2 output
# columns, rounding up gives 3. The extra column averages only the elements
# that lie inside the input.
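To see where the value 6.5 in the extra column comes from, the sketch below reproduces the first window of that column by hand. It assumes the same 4×5 input as the example above; `partial_window` is a name introduced only for this illustration.

```python
import torch
from torch import nn

img = torch.arange(20, dtype=torch.float).reshape(1, 1, 4, 5)
out = nn.AvgPool2d(2, stride=2, ceil_mode=True)(img)

# The third output column starts at input column 4, so each 2x2 window there
# is cut off at the right border and contains a single input column.
partial_window = img[0, 0, 0:2, 4:6]   # slicing past the edge keeps column 4 only
print(partial_window)                  # holds the values 4 and 9

# The average is taken over the two elements actually present.
print(partial_window.mean().item(), out[0, 0, 0, 2].item())
```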
The difference between setting and not setting padding
import torch
from torch import nn

img = torch.arange(16, dtype=torch.float).reshape(1, 1, 4, 4)

pool_t = nn.AvgPool2d(2, stride=2, padding=1)  # with padding
pool_f = nn.AvgPool2d(2, stride=2)             # without padding
img_2 = pool_t(img)
img_3 = pool_f(img)

print(img)
print(img_2)
print(img_3)
output
# Original image
tensor([[[[ 0.,  1.,  2.,  3.],
          [ 4.,  5.,  6.,  7.],
          [ 8.,  9., 10., 11.],
          [12., 13., 14., 15.]]]])
# Padding width is 1; by default the padded zeros are included in the average
tensor([[[[0.0000, 0.7500, 0.7500],
          [3.0000, 7.5000, 4.5000],
          [3.0000, 6.7500, 3.7500]]]])
# Result without padding
tensor([[[[ 2.5000,  4.5000],
          [10.5000, 12.5000]]]])
# The size of the padded result follows from the output shape formula above
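The implicit padding can also be made explicit. A minimal sketch, assuming the default count_include_pad=True: padding the input with zeros via torch.nn.functional.pad and then pooling without padding should give the same result as the built-in padding argument.

```python
import torch
import torch.nn.functional as F
from torch import nn

img = torch.arange(16, dtype=torch.float).reshape(1, 1, 4, 4)

# Built-in padding (count_include_pad=True by default).
out_builtin = nn.AvgPool2d(2, stride=2, padding=1)(img)

# Explicit zero padding of one pixel on the left, right, top and bottom,
# followed by pooling without padding.
padded = F.pad(img, (1, 1, 1, 1), mode="constant", value=0.0)
out_manual = nn.AvgPool2d(2, stride=2)(padded)

print(padded.shape)
print(torch.allclose(out_builtin, out_manual))
```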
The difference between setting count_include_pad to True and False
import torch
from torch import nn

img = torch.arange(16, dtype=torch.float).reshape(1, 1, 4, 4)

pool_t = nn.AvgPool2d(2, stride=2, padding=1, count_include_pad=True)
pool_f = nn.AvgPool2d(2, stride=2, padding=1, count_include_pad=False)
img_2 = pool_t(img)
img_3 = pool_f(img)

print(img)
print(img_2)
print(img_3)
output
# Original image
tensor([[[[ 0.,  1.,  2.,  3.],
          [ 4.,  5.,  6.,  7.],
          [ 8.,  9., 10., 11.],
          [12., 13., 14., 15.]]]])
# Padding width is 1, count_include_pad is True (the default):
# the padded zeros are included in the average
tensor([[[[0.0000, 0.7500, 0.7500],
          [3.0000, 7.5000, 4.5000],
          [3.0000, 6.7500, 3.7500]]]])
# count_include_pad is False: the padded zeros are excluded from the average
tensor([[[[ 0.0000,  1.5000,  3.0000],
          [ 6.0000,  7.5000,  9.0000],
          [12.0000, 13.5000, 15.0000]]]])
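One way to see how the two settings relate is to count the real (non-padding) elements in each window. The sketch below does this by pooling a tensor of ones; `valid_fraction` and `valid_count` are names introduced for this illustration.

```python
import torch
from torch import nn

img = torch.arange(16, dtype=torch.float).reshape(1, 1, 4, 4)
pool_t = nn.AvgPool2d(2, stride=2, padding=1, count_include_pad=True)
pool_f = nn.AvgPool2d(2, stride=2, padding=1, count_include_pad=False)

# Pooling a tensor of ones with count_include_pad=True gives, for each window,
# the fraction of its 4 positions that hold real input rather than padding.
valid_fraction = pool_t(torch.ones_like(img))
valid_count = valid_fraction * 4
print(valid_count)

# Dividing by that fraction rescales each average to use only real elements,
# which matches the count_include_pad=False result.
print(torch.allclose(pool_t(img) / valid_fraction, pool_f(img)))
```

Corner windows contain one real element, edge windows two, and the centre window four, which is why only the centre value (7.5) is identical in both outputs.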
The difference between setting and not setting divisor_override
import torch
from torch import nn

img = torch.arange(16, dtype=torch.float).reshape(1, 1, 4, 4)

pool_1 = nn.AvgPool2d(2, stride=2)
pool_d1 = nn.AvgPool2d(2, stride=2, divisor_override=2)
pool_d2 = nn.AvgPool2d(2, stride=2, divisor_override=3)
img_1 = pool_1(img)
img_2 = pool_d1(img)
img_3 = pool_d2(img)

print(img)
print(img_1)
print(img_2)
print(img_3)
output
# Original image
tensor([[[[ 0.,  1.,  2.,  3.],
          [ 4.,  5.,  6.,  7.],
          [ 8.,  9., 10., 11.],
          [12., 13., 14., 15.]]]])
# divisor_override is not set: ordinary average pooling
tensor([[[[ 2.5000,  4.5000],
          [10.5000, 12.5000]]]])
# divisor_override is set to 2. The first element is the sum of the four
# elements in the upper-left corner of the original image divided by 2
tensor([[[[ 5.,  9.],
          [21., 25.]]]])
# divisor_override is set to 3: the sum of the four elements divided by 3
tensor([[[[ 3.3333,  6.0000],
          [14.0000, 16.6667]]]])
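A common use of this parameter, shown here as a sketch rather than a recommendation from the source, is to set divisor_override=1 so that each window sum is divided by 1, which turns the layer into sum pooling. The names `sum_pool` and `window_sums` are illustrative.

```python
import torch
from torch import nn

img = torch.arange(16, dtype=torch.float).reshape(1, 1, 4, 4)

avg_pool = nn.AvgPool2d(2, stride=2)
sum_pool = nn.AvgPool2d(2, stride=2, divisor_override=1)  # divide each window sum by 1

window_sums = sum_pool(img)
print(window_sums)

# Each 2x2 window has 4 elements, so the sums equal 4 times the averages.
print(torch.allclose(window_sums, avg_pool(img) * 4))
```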
Official documentation
torch.nn.AvgPool2d(): https://pytorch.org/docs/stable/generated/torch.nn.AvgPool2d.html?highlight=avgpool2d#torch.nn.AvgPool2d