PyTorch

Introduction

PyTorch is another deep learning library; it is essentially a fork of Chainer (a deep learning library written entirely in Python) combined with the capabilities of Torch. Basically, it is Facebook's solution for merging Torch with Python.

Some advantages

  • Easy to debug and understand the code

  • Has as many types of layers as Torch (unpooling, 1D/2D/3D convolutions, LSTMs, GRUs)

  • Lots of loss functions

  • Can be considered a NumPy extension with GPU support

  • Faster than other "define-by-run" libraries, like Chainer and DyNet

  • Allows building networks whose structure depends on the computation itself (useful in reinforcement learning); see the sketch after this list
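
As a minimal sketch of that last point (the class name, layer size, and loop rule below are purely illustrative, not from the original text), here is a network whose forward pass re-applies the same layer a number of times that depends on the input itself, something a static graph cannot express directly:

import torch
import torch.nn as nn
from torch.autograd import Variable

class DynamicNet(nn.Module):
    def __init__(self):
        super(DynamicNet, self).__init__()
        self.linear = nn.Linear(4, 4)

    def forward(self, x):
        # Loop count decided at run-time from the data itself
        n_steps = int(x.data.abs().sum()) % 3 + 1
        for _ in range(n_steps):
            x = self.linear(x)
        return x

net = DynamicNet()
out = net(Variable(torch.rand(1, 4)))
print(out)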

PyTorch Components

Package                  Description
torch                    NumPy-like tensor library with GPU support
torch.autograd           Provides automatic differentiation for all torch operations
torch.nn                 Neural network library integrated with autograd
torch.optim              Optimizers for torch.nn (Adam, SGD, RMSprop, etc.)
torch.multiprocessing    Multiprocessing with memory sharing between tensors
torch.utils              DataLoader, training, and other utility functions
torch.legacy             Old code ported from Torch
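
To see how these packages fit together, here is a small hedged sketch (the layer size, data, and learning rate are arbitrary choices for illustration): torch provides the tensors, torch.nn the model and loss, torch.autograd the gradients, and torch.optim the parameter update.

import torch
import torch.nn as nn
from torch.autograd import Variable

# torch: raw tensors for a toy regression problem (values are arbitrary)
x = Variable(torch.rand(10, 3))
y = Variable(torch.rand(10, 1))

# torch.nn: a single linear layer and a built-in loss function
model = nn.Linear(3, 1)
criterion = nn.MSELoss()

# torch.optim: plain SGD over the model parameters
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# torch.autograd: backward() fills in the gradients used by the optimizer
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
print(loss)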

How it differs from Tensorflow/Theano

The major difference from TensorFlow is that PyTorch follows a "define-by-run" methodology, while TensorFlow follows "define-and-run". On PyTorch you can, for instance, change your model at run-time and debug it easily with any Python debugger, while TensorFlow always requires a graph definition/build step before execution. You can consider TensorFlow more of a production tool and PyTorch more of a research tool.

The Basics:

Here we will see how to create tensors and do some basic manipulation:

import torch
import numpy as np

# Create a random 3x3 tensor in torch
a = torch.rand(3, 3)

# Create a matrix on numpy and convert it to PyTorch
b_npy = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Convert from numpy to torch
b = torch.from_numpy(b_npy)

print(a)
print(b)

# Get a specific element
print(b[1, 1])

# Get a range of elements (from row/column 1 to the end)
print(b[1:, 1:])

# Set elements on the tensor
a[1:, 1:] = 0
print(a)
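
Going the other way is just as easy: a CPU tensor can be converted back to NumPy with .numpy(). As a hedged aside, the tensor and the array then share the same underlying memory, so in-place changes on one are visible through the other.

import torch

a = torch.ones(2, 2)

# Convert from torch to numpy (CPU tensors share memory with the array)
a_npy = a.numpy()
print(a_npy)

# In-place changes on the tensor are visible through the numpy array
a.add_(1)
print(a_npy)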

Create tensors filled with some value

import torch

a = torch.ones(2,3)
b = torch.zeros(3,2)
print(a)
print(b)
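
Beyond ones and zeros, one way (a small sketch) to fill a tensor with an arbitrary value is the in-place fill_ method, or simply scaling a tensor of ones:

import torch

# Allocate a 2x3 tensor and fill it in-place with the value 7
c = torch.ones(2, 3).fill_(7)
print(c)

# The same effect by scaling a tensor of ones
d = torch.ones(2, 3) * 7
print(d)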

Now we will do some computation on the GPU

import torch
import numpy as np

# Define tensors on the GPU
a = torch.rand(2, 3).cuda()
b = torch.rand(2, 3).cuda()

# Define some operation (will execute on the GPU)
c = (a + b) * 2

# Print "c" contents and shape(size)
print(c)
print(c.size())
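
As a hedged follow-up to the snippet above: a CUDA tensor cannot be converted to NumPy directly, so the result first has to be moved back to the CPU with .cpu().

import torch

# Same GPU computation as above, then moved back to the CPU
a = torch.rand(2, 3).cuda()
b = torch.rand(2, 3).cuda()
c = (a + b) * 2

# .cpu() copies the tensor back to host memory before .numpy()
print(c.cpu().numpy())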

Autograd and variables

The autograd package is the PyTorch component responsible for backpropagation: as in TensorFlow, you only need to define the forward propagation. PyTorch's autograd looks a lot like TensorFlow's: in both frameworks we define a computational graph and use automatic differentiation to compute gradients.

We just need to wrap tensors with Variable objects; a Variable represents a node in the computational graph. Variables are not like TensorFlow placeholders: in PyTorch you place the values directly in the model. So, to include a tensor in the graph, simply wrap it in a Variable.

Consider the following simple graph:

import torch
from torch.autograd import Variable

# Define scalars a=2, b=3, c=4 (1x1 tensors wrapped in Variables)
a = Variable(torch.ones(1, 1) * 2, requires_grad=True)
b = Variable(torch.ones(1, 1) * 3, requires_grad=True)
c = Variable(torch.ones(1, 1) * 4, requires_grad=True)

# Define the function out = (a*b) + c
out = (a*b) + c
# Equivalently: out = torch.mul(a, b) + c
print('Value out:', out)

# Do the backpropagation
out.backward()

# Get dout/da (derivative of out w.r.t. a). Since out = a*b + c:
# dout/da = b = 3, dout/db = a = 2, dout/dc = 1
print('Derivative of out w.r.t to a:', a.grad)
print('Derivative of out w.r.t to b:', b.grad)
print('Derivative of out w.r.t to c:', c.grad)
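
One detail worth noting (a hedged aside, not in the original text): gradients accumulate into .grad across backward() calls rather than being overwritten, which is why training loops call zero_grad() before each backward pass. A minimal sketch:

import torch
from torch.autograd import Variable

a = Variable(torch.ones(1, 1) * 2, requires_grad=True)

# Build the graph and call backward twice: a.grad accumulates both gradients
for _ in range(2):
    out = (a * 3).sum()
    out.backward()
print(a.grad)          # d(3a)/da = 3, accumulated twice -> 6

# Reset the accumulated gradient manually (optimizers do this via zero_grad())
a.grad.data.zero_()
print(a.grad)          # back to 0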

Complete example

Here we combine the previous concepts and show how to train a CNN on the MNIST dataset.

# Import libraries
import torch
from torch.autograd import Variable
import torchvision.datasets as dsets
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F

# Hyper Parameters
num_epochs = 5
batch_size = 50
learning_rate = 0.001

# MNIST Dataset
train_dataset = dsets.MNIST(root='../data/',
                            train=True, 
                            transform=transforms.ToTensor(),
                            download=True)

test_dataset = dsets.MNIST(root='../data/',
                           train=False, 
                           transform=transforms.ToTensor())


# Data Loader (Input Pipeline)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size, 
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size, 
                                          shuffle=False)


# CNN Model (2 conv layers); nn.Module is the base class for all neural networks
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, padding=2),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, padding=2),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.fc = nn.Linear(7*7*32, 10)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out

cnn = CNN()
cnn.cuda()
print(cnn)

# Loss and Optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(cnn.parameters(), lr=learning_rate)

# Train the Model
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        images = Variable(images)
        labels = Variable(labels)

        images, labels = images.cuda(), labels.cuda()

        # Forward + Backward + Optimize
        optimizer.zero_grad()
        outputs = cnn(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        if (i+1) % 500 == 0:
            print ('Epoch [%d/%d], Iter [%d/%d] Loss: %.4f' 
                   %(epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.data[0]))


# Test the Model
cnn.eval()  # Change model to 'eval' mode (BN uses moving mean/var).
correct = 0
total = 0
for images, labels in test_loader:
    images = Variable(images)
    images, labels = images.cuda(), labels.cuda()
    outputs = cnn(images)
    _, predicted = torch.max(outputs.data, 1)
    total += labels.size(0)
    correct += (predicted == labels).sum()

print('Test Accuracy of the model on the 10000 test images: %d %%' % (100 * correct / total))

# Save the Trained Model
torch.save(cnn.state_dict(), 'cnn.pkl')
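
To use the checkpoint later, the usual pattern (a short sketch, assuming the same CNN class definition from above is available) is to instantiate the model again and load the saved state_dict:

# Load the trained weights back into a fresh model instance
cnn2 = CNN()
cnn2.load_state_dict(torch.load('cnn.pkl'))
cnn2.cuda()
cnn2.eval()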

References:

http://pytorch.org/
http://pytorch.org/docs/index.html
https://hackernoon.com/how-is-pytorch-different-from-tensorflow-2c90f44747d6
https://blog.paperspace.com/adversarial-autoencoders-with-pytorch/
https://devblogs.nvidia.com/parallelforall/recursive-neural-networks-pytorch/
http://blog.outcome.io/pytorch-quick-start-classifying-an-image/
https://github.com/ritchieng/the-incredible-pytorch
https://www.youtube.com/watch?v=nbJ-2G2GXL0
https://www.youtube.com/watch?v=4RzoFWre44Y&t=7s
https://www.youtube.com/watch?v=hiIqRUseouQ
https://github.com/PythonWorkshop/Intro-to-TensorFlow-and-PyTorch/blob/master/PyTorch%20Tutorial.ipynb
https://github.com/PythonWorkshop/Intro-to-TensorFlow-and-PyTorch
https://github.com/pytorch/examples
http://pytorch.org/tutorials/
http://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html
http://www.cs.toronto.edu/~rgrosse/courses/csc321_2017/tutorials/tut4.pdf
http://www.cs.toronto.edu/~rgrosse/courses/csc321_2017/slides/lec1.pdf
https://discuss.pytorch.org/t/understanding-loss-function-gradients/771/5
https://discuss.pytorch.org/t/visual-watcher-when-training-evaluating-or-tensorboard-equivalence/146/8
https://github.com/jcjohnson/pytorch-examples
https://discuss.pytorch.org/t/print-autograd-graph/692/8
https://discuss.pytorch.org/t/print-autograd-graph/692
https://github.com/szagoruyko/functional-zoo/blob/master/visualize.py
https://github.com/szagoruyko/functional-zoo/blob/master/resnet-18-export.ipynb
https://www.safaribooksonline.com/library/view/strata-hadoop/9781491976166/video302404.html
https://discuss.pytorch.org/t/build-your-own-loss-function-in-pytorch/235
http://pytorch.org/docs/notes/extending.html#extending-torch-autograd
http://blog.gaurav.im/2017/04/24/a-gentle-intro-to-pytorch/
https://stackoverflow.com/questions/41924453/pytorch-how-to-use-dataloaders-for-custom-datasets
https://discuss.pytorch.org/t/saving-and-loading-a-model-in-pytorch/2610/3
https://discuss.pytorch.org/t/load-a-saved-model/109
https://discuss.pytorch.org/t/saving-torch-models/838
https://discuss.pytorch.org/t/saving-custom-models/621/4
https://discuss.pytorch.org/t/build-your-own-loss-function-in-pytorch/235/19
https://github.com/pytorch/examples/blob/master/imagenet/main.py
https://github.com/pytorch/examples/blob/master/mnist/main.py
https://discuss.pytorch.org/t/discussion-about-datasets-and-dataloaders/296
https://www.kaggle.com/mratsim/starting-kit-for-pytorch-deep-learning
https://iamtrask.github.io/2017/01/15/pytorch-tutorial/