Residual Net

Introduction

This chapter presents the 2016 state of the art in object classification. The residual net (ResNet) is basically a convolutional neural network around 150 layers deep, built by stacking identical "residual" blocks.

The problem is that for really deep networks (more than 30 layers), all the known techniques (ReLU, dropout, batch-norm, etc.) are not enough to achieve good end-to-end training. This contrasts with the common, empirically supported belief that deeper is better.

The idea of the residual network is to use blocks that re-route the input and add it to the concept learned by the block's layers. In other words, each block outputs its input plus whatever it learned on top of that input (y = F(x) + x), so the next layer receives the previous layer's concepts together with the reference that was used to learn them. This works better than learning a concept with no reference to its input.

Another way to visualize their solution is to remember that back-propagation through a sum node replicates the incoming gradient to both of its inputs with no degradation.
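To make this concrete, here is a tiny PyTorch check (a sketch of our own, not from the original text; the linear layer just stands in for the residual branch F(x)) showing that the gradient flowing into the sum node reaches the skip path undegraded:

```python
import torch
import torch.nn as nn

x = torch.randn(4, requires_grad=True)
branch = nn.Linear(4, 4)   # stands in for the residual branch F(x)

y = x + branch(x)          # the residual sum node: y = F(x) + x
y.sum().backward()

# dy/dx contains an identity term from the skip path plus the branch term,
# so the gradient reaching x can never be wiped out by the branch alone.
print(x.grad)
```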

Below we show an example of a 34-layer-deep residual net.

The ResidualNet authors showed empirically that it is easier to train a 34-layer residual network than a 34-layer plain cascaded network (like VGG).

Observe that at the end of the residual net there is only one fully connected layer, preceded by an average pooling layer.
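A minimal sketch of that tail end in PyTorch (the 512-channel input and 1000 classes are illustrative assumptions, matching the ImageNet setup rather than anything stated here):

```python
import torch.nn as nn

# Global average pooling collapses each feature map to a single value,
# then a single fully connected layer produces the class scores.
tail = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(512, 1000),
)
```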

Residual Block

At its core, the residual net is formed by the following structure.

Basically, this jump (skip connection) and adder create a shortcut path for back-propagation, allowing even really deep models to be trained.

As mentioned before, the batch-norm block alleviates the network's sensitivity to initialization, but it can be omitted for not-so-deep models (fewer than 50 layers).
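For reference, a minimal sketch of such a residual block written with PyTorch (the 3x3 kernels and the single channel count are illustrative choices, not taken from the original figure):

```python
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """conv -> batch-norm -> ReLU -> conv -> batch-norm, plus the skip connection."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + x          # the "jump and adder": identity skip connection
        return self.relu(out)

# Usage: the block keeps the 64-channel, 56x56 feature-map shape unchanged.
block = BasicResidualBlock(64)
y = block(torch.randn(1, 64, 56, 56))
```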

Again, as with GoogleNet, we must use bottlenecks to avoid a parameter explosion.

Just remember that for the bottleneck addition to work, the output of the block must have the same depth as its input.
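A hedged sketch of a bottleneck version, assuming the usual 1x1-reduce / 3x3 / 1x1-expand layout (the 256/64 channel counts are illustrative): the final 1x1 convolution restores the input depth so the addition is valid.

```python
import torch
import torch.nn as nn

class BottleneckBlock(nn.Module):
    """1x1 reduce -> 3x3 -> 1x1 expand, so the expensive 3x3 conv sees fewer channels."""
    def __init__(self, in_channels=256, mid_channels=64):
        super().__init__()
        self.reduce = nn.Conv2d(in_channels, mid_channels, kernel_size=1, bias=False)
        self.conv = nn.Conv2d(mid_channels, mid_channels, kernel_size=3, padding=1, bias=False)
        self.expand = nn.Conv2d(mid_channels, in_channels, kernel_size=1, bias=False)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.reduce(x))
        out = self.relu(self.conv(out))
        out = self.expand(out)
        # expand brings the depth back to in_channels, so the sum with x is valid
        return self.relu(out + x)

block = BottleneckBlock()
y = block(torch.randn(1, 256, 28, 28))
```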

Caffe Example

Here we show two cascaded residual blocks from the residual net. Due to difficulties with the batch-norm layers, they were omitted, but the residual net still gives good results.
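The original prototxt listing is not reproduced here; the sketch below, written with Caffe's Python NetSpec interface, shows one way to build two such cascaded blocks without batch-norm. The blob names, channel counts, and input shape are illustrative assumptions.

```python
import caffe
from caffe import layers as L, params as P

n = caffe.NetSpec()
n.data = L.Input(shape=dict(dim=[1, 64, 56, 56]))

# First residual block: two 3x3 convolutions, then an element-wise SUM with the input.
n.res1_conv1 = L.Convolution(n.data, num_output=64, kernel_size=3, pad=1,
                             weight_filler=dict(type='msra'))
n.res1_relu1 = L.ReLU(n.res1_conv1, in_place=True)
n.res1_conv2 = L.Convolution(n.res1_relu1, num_output=64, kernel_size=3, pad=1,
                             weight_filler=dict(type='msra'))
n.res1_add = L.Eltwise(n.data, n.res1_conv2, operation=P.Eltwise.SUM)
n.res1_out = L.ReLU(n.res1_add, in_place=True)

# Second, cascaded residual block taking the first block's output as its input.
n.res2_conv1 = L.Convolution(n.res1_out, num_output=64, kernel_size=3, pad=1,
                             weight_filler=dict(type='msra'))
n.res2_relu1 = L.ReLU(n.res2_conv1, in_place=True)
n.res2_conv2 = L.Convolution(n.res2_relu1, num_output=64, kernel_size=3, pad=1,
                             weight_filler=dict(type='msra'))
n.res2_add = L.Eltwise(n.res1_out, n.res2_conv2, operation=P.Eltwise.SUM)
n.res2_out = L.ReLU(n.res2_add, in_place=True)

# Write the generated prototxt so it can be inspected or trained with Caffe.
with open('residual_blocks.prototxt', 'w') as f:
    f.write(str(n.to_proto()))
```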