Pooling Layer

Introduction

The pooling layer is used to reduce the spatial dimensions, but not the depth, of a convolutional neural network model. Basically, this is what you gain:

  1. By having less spatial information you gain computational performance

  2. Less spatial information also means fewer parameters, so there is less chance of over-fitting

  3. You get some translation invariance

Some projects don't use pooling, especially when they want to "learn" an object's specific position, for example when learning how to play Atari games.

The diagram below shows the most common type of pooling, the max-pooling layer, which slides a window, like a normal convolution, and takes the biggest value in the window as the output.

The most important parameters to play with:

  • Input: H1 x W1 x Depth_In x N

  • Stride: scalar that controls the number of pixels that the window slides.

  • K: Kernel size

Regarding its output H2 x W2 x Depth_Out x N:

$$W_2 = (W_1 - K)/S + 1 \\ H_2 = (H_1 - K)/S + 1 \\ Depth_{out} = Depth_{in}$$

For example, pooling a 224 x 224 input with K = 2 and S = 2 gives a 112 x 112 output with the same depth. It's also worth pointing out that there are no learnable parameters in the pooling layer, so its backpropagation is simpler.

Forward Propagation

The window movement mechanism in pooling layers is the same as in the convolution layer; the only change is that we select the biggest value in the window.

Python Forward propagation
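
The original code listing is not reproduced on this page, so the snippet below is a minimal naive sketch of the max-pooling forward pass, assuming a numpy array with the H1 x W1 x Depth x N layout described above (the function name and the returned cache are assumptions for illustration, not the book's exact code):

```python
import numpy as np

def max_pool_forward(x, K, S):
    """Naive max-pooling forward pass.

    x: input tensor of shape (H1, W1, Depth, N), the layout used in this chapter.
    K: kernel (window) size.
    S: stride.
    Returns the pooled output plus a cache used by the backward pass.
    """
    H1, W1, depth, N = x.shape
    H2 = (H1 - K) // S + 1
    W2 = (W1 - K) // S + 1
    out = np.zeros((H2, W2, depth, N))

    for n in range(N):                      # every sample in the batch
        for d in range(depth):              # depth is kept unchanged
            for i in range(H2):             # slide the window vertically
                for j in range(W2):         # slide the window horizontally
                    window = x[i*S:i*S+K, j*S:j*S+K, d, n]
                    out[i, j, d, n] = np.max(window)

    return out, (x, K, S)
```

For instance, a 4 x 4 x 1 x 1 input with K = 2 and S = 2 produces a 2 x 2 x 1 x 1 output, matching the size formula above.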

Matlab Forward propagation

Backward Propagation

From the backpropagation chapter we learned that the max node simply acts as a router, giving the incoming gradient "dout" to the input that had the largest value during the forward pass.

You can consider that max pooling uses a series of max nodes in its computation graph. So the backward propagation of the max pooling layer is a product between a mask, containing all elements that were selected during the forward propagation, and dout.

In other words, the gradient with respect to the input of the max pooling layer will be a tensor made of zeros, except at the places that were selected during the forward propagation.

Python Backward propagation
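
As with the forward pass, the original listing is not shown here; the sketch below implements the mask-and-route idea described above for the same assumed H1 x W1 x Depth x N layout, reusing the cache returned by the forward sketch:

```python
import numpy as np

def max_pool_backward(dout, cache):
    """Naive max-pooling backward pass.

    dout : gradient of the loss w.r.t. the pooling output, shape (H2, W2, Depth, N).
    cache: (x, K, S) saved by the forward pass.
    Returns dx, which is zero everywhere except at the positions that won the max
    during the forward pass; those positions receive the corresponding dout value.
    """
    x, K, S = cache
    H2, W2, depth, N = dout.shape
    dx = np.zeros_like(x, dtype=dout.dtype)

    for n in range(N):
        for d in range(depth):
            for i in range(H2):
                for j in range(W2):
                    window = x[i*S:i*S+K, j*S:j*S+K, d, n]
                    # Mask of the element(s) selected during the forward pass
                    # (in this naive version, ties route the gradient to every maximum).
                    mask = (window == np.max(window))
                    dx[i*S:i*S+K, j*S:j*S+K, d, n] += mask * dout[i, j, d, n]

    return dx
```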

Improving performance

In a future chapter we will learn a technique that improves convolution performance; until then we will stick with the naive implementation.

Next Chapter

In the next chapter we will learn about the Batch Norm layer.
