Pooling Layer
Introduction
The pooling layer is used to reduce the spatial dimensions, but not the depth, of a convolutional neural network. This is what you gain:
Less spatial information means less computation, so better performance
Less spatial information also means fewer parameters, so less chance of over-fitting
You get some translation invariance
Some projects don't use pooling, especially when the model needs to learn the specific position of an object, for example when learning to play Atari games.
The diagram below shows the most common type of pooling, the max-pooling layer, which slides a window over the input like a normal convolution and outputs the largest value in each window.
The most important parameters to play with:
Input: H1 x W1 x Depth_In x N
Stride: Scalar that controls the number of pixels the window slides at each step.
K: Kernel size
Regarding its output, H2 x W2 x Depth_Out x N: with no padding, H2 = (H1 - K) / Stride + 1, W2 = (W1 - K) / Stride + 1, and Depth_Out = Depth_In.
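The output size can be checked with a few lines of Python. This is a minimal sketch that assumes no padding and a square K x K window; the function name `pool_output_shape` is illustrative and does not come from any library.

```python
def pool_output_shape(h1, w1, depth_in, n, k, stride):
    """Return (H2, W2, Depth_Out, N) for a pooling layer without padding."""
    # Floor division: a window that would run past the border is dropped.
    h2 = (h1 - k) // stride + 1
    w2 = (w1 - k) // stride + 1
    # Pooling works on each channel independently, so depth is unchanged.
    return (h2, w2, depth_in, n)


print(pool_output_shape(224, 224, 64, 1, k=2, stride=2))  # (112, 112, 64, 1)
```

With K = 2 and Stride = 2, the usual setting, each spatial dimension is halved.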
It is also worth pointing out that the pooling layer has no learnable parameters, so its backpropagation is simpler.
Forward Propagation
The window movement mechanism in pooling layers is the same as in a convolution layer; the only change is that we select the largest value in the window.
Python Forward propagation
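A minimal NumPy sketch of a naive max-pooling forward pass. It assumes the H x W x Depth x N layout listed above, no padding, and a square K x K window. The function name `max_pool_forward` and the returned `cache` tuple are illustrative choices, not taken from a specific library.

```python
import numpy as np


def max_pool_forward(x, k=2, stride=2):
    """Naive max pooling over an input of shape (H1, W1, depth, N)."""
    h1, w1, depth, n = x.shape
    h2 = (h1 - k) // stride + 1
    w2 = (w1 - k) // stride + 1
    out = np.zeros((h2, w2, depth, n), dtype=x.dtype)

    for i in range(h2):
        for j in range(w2):
            r, c = i * stride, j * stride
            # Window of shape (k, k, depth, N) at the current position.
            window = x[r:r + k, c:c + k, :, :]
            # Largest value per channel and per sample.
            out[i, j] = window.max(axis=(0, 1))

    # Keep what a backward pass would need (assumed convention).
    cache = (x, k, stride)
    return out, cache


x = np.arange(16, dtype=float).reshape(4, 4, 1, 1)
out, cache = max_pool_forward(x, k=2, stride=2)
print(out[:, :, 0, 0])  # the max of each 2x2 block: 5, 7, 13, 15
```

The two Python loops make this slow on large inputs, but they keep the sliding-window logic easy to follow.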
Matlab Forward propagation
Backward Propagation
From the backpropagation chapter we learned that the max node simply acts as a router, passing the incoming gradient "dout" to the input that had the largest value.
In other words, the gradient with respect to the input of the max-pooling layer is a tensor made of zeros except at the positions that were selected during forward propagation.
Python Backward propagation
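A minimal NumPy sketch of the matching naive backward pass. It assumes the H x W x Depth x N layout, no padding, and that the forward-pass input is handed over in a `cache` tuple `(x, k, stride)`; both the function name `max_pool_backward` and that tuple are assumptions made for illustration.

```python
import numpy as np


def max_pool_backward(dout, cache):
    """Route dout back to the positions that held each window's maximum.

    dout has shape (H2, W2, depth, N); cache is (x, k, stride), where x is
    the forward-pass input of shape (H1, W1, depth, N).
    """
    x, k, stride = cache
    h2, w2, depth, n = dout.shape
    dx = np.zeros(x.shape, dtype=dout.dtype)

    for i in range(h2):
        for j in range(w2):
            r, c = i * stride, j * stride
            window = x[r:r + k, c:c + k, :, :]
            # True where an element equals its window maximum. If several
            # elements tie, each of them receives the gradient in this sketch.
            mask = window == window.max(axis=(0, 1), keepdims=True)
            # Accumulate, since windows overlap when stride < k.
            dx[r:r + k, c:c + k, :, :] += mask * dout[i, j]

    return dx


x = np.arange(16, dtype=float).reshape(4, 4, 1, 1)
dout = np.ones((2, 2, 1, 1))
dx = max_pool_backward(dout, (x, 2, 2))
print(dx[:, :, 0, 0])  # ones at the four max positions, zeros elsewhere
```

Because there are no weights to update, the gradient with respect to the input is the only thing this layer has to compute.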
Improving performance
In a future chapter we will learn a technique that improves convolution performance; until then we will stick with the naive implementation.
Next Chapter
In the next chapter we will learn about the Batch Norm layer.