Artificial Intelligence
  • Preface
  • Introduction
  • Machine Learning
    • Linear Algebra
    • Supervised Learning
      • Neural Networks
      • Linear Classification
      • Loss Function
      • Model Optimization
      • Backpropagation
      • Feature Scaling
      • Model Initialization
      • Recurrent Neural Networks
        • Machine Translation Using RNN
    • Deep Learning
      • Convolution
      • Convolutional Neural Networks
      • Fully Connected Layer
      • Relu Layer
      • Dropout Layer
      • Convolution Layer
        • Making faster
      • Pooling Layer
      • Batch Norm layer
      • Model Solver
      • Object Localization and Detection
      • Single Shot Detectors
        • Yolo
        • SSD
      • Image Segmentation
      • GoogleNet
      • Residual Net
      • Deep Learning Libraries
    • Unsupervised Learning
      • Principal Component Analysis
      • Generative Models
    • Distributed Learning
    • Methodology for usage
      • Imbalanced/Missing Datasets
  • Artificial Intelligence
    • OpenAI Gym
    • Tree Search
    • Markov Decision process
    • Reinforcement Learning
      • Q_Learning_Simple
      • Deep Q Learning
      • Deep Reinforcement Learning
    • Natural Language Processing
      • Word2Vec
  • Appendix
    • Statistics and Probability
      • Probability
        • Markov Chains
        • Random Walk
    • Lua and Torch
    • Tensorflow
      • Multi Layer Perceptron MNIST
      • Convolution Neural Network MNIST
      • SkFlow
    • PyTorch
      • Transfer Learning
      • DataLoader and DataSets
      • Visualizing Results
Dropout Layer


Introduction

Dropout is a technique used to reduce over-fitting in neural networks; it is usually combined with other regularization techniques such as L2 regularization.

Below we show the classification error (not the loss); observe that the test/validation error is lower when dropout is used.

Like other regularization techniques, dropout makes the training error slightly worse. That is the idea: we trade some training performance for better generalization. Remember that the more capacity you add to your model (more layers or more neurons), the more prone to over-fitting it becomes.

Below is a plot showing both training and validation loss, with and without dropout.

How it works

During training, a fraction of the neurons on a particular layer (typically half) is randomly deactivated. This improves generalization because it forces the layer to learn the same "concept" with different subsets of neurons.

During the prediction phase, dropout is deactivated.

Where to use Dropout layers

Deep learning models normally apply dropout on the fully connected layers, but it is also possible to use dropout after the max-pooling layers, which acts as a kind of image noise augmentation. A placement sketch is shown below.
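As an illustration of this placement, the sketch below (assuming PyTorch and arbitrary layer sizes) puts a Dropout layer between two fully connected layers; calling `model.eval()` deactivates dropout at prediction time.

```python
import torch.nn as nn

# Hypothetical classifier head: dropout is applied after the fully
# connected layer's activation, not inside the convolutional blocks.
model = nn.Sequential(
    nn.Linear(256, 128),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # half of the activations are zeroed during training
    nn.Linear(128, 10),
)

model.train()  # dropout active during training
model.eval()   # dropout deactivated for prediction
```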

Implementation

In order to implement this neuron deactivation, we create a mask (zeros and ones) during forward propagation. This mask is applied to the layer outputs during training and cached for later use in back-propagation. As explained before, this dropout mask is used only during training.

On the backward propagation we're interested only in the neurons that were kept active (that is why we save the mask from the forward propagation). With those neurons selected we just back-propagate dout. The dropout layer has no learnable parameters, only its input (X), so during back-propagation we simply return dx. In other words: $dx = dout \cdot mask_{cached}$

Python Forward propagation
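The original snippet was embedded from an external gist and is not reproduced on this page; below is a minimal NumPy sketch of the forward pass, assuming the common "inverted dropout" formulation (the mask is scaled by 1/p so no extra scaling is needed at prediction time). The function name `dropout_forward` and the `(prob, mode)` parameters are illustrative, not the book's exact code.

```python
import numpy as np

def dropout_forward(x, prob=0.5, mode='train'):
    """Dropout forward pass (inverted dropout sketch).

    x    : layer input (any shape)
    prob : probability of keeping a neuron active
    mode : 'train' applies the mask, anything else leaves x unchanged
    """
    mask = None
    if mode == 'train':
        # Binary mask of kept neurons, scaled by 1/prob (inverted dropout)
        mask = (np.random.rand(*x.shape) < prob) / prob
        out = x * mask
    else:
        # Dropout is deactivated at prediction time
        out = x
    # Cache the mask so back-propagation can reuse it
    cache = (prob, mode, mask)
    return out, cache
```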

Python Backward propagation
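Again as a sketch matching the forward pass above: the backward pass just multiplies the incoming gradient dout by the cached mask, which is exactly $dx = dout \cdot mask_{cached}$.

```python
def dropout_backward(dout, cache):
    """Dropout backward pass: gradients flow only through the kept neurons."""
    prob, mode, mask = cache
    if mode == 'train':
        dx = dout * mask   # dx = dout . mask_cached
    else:
        dx = dout
    return dx
```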

Next Chapter

In the next chapter we will learn about the Convolution layer.