In this book I have stressed the importance of knowing how to write your own deep learning/artificial intelligence library. But it is also very important, especially while researching a topic, to understand the most common libraries. This chapter will teach the basics of Torch, but before that we're also going to learn Lua.
Lua language
Lua was first created to be used on embedded systems; the idea was to have a simple, fast, cross-platform language. One of the main features of Lua is its easy integration with C/C++.
Lua was originally designed in 1993 as a language for extending software applications to meet the increasing demand for customization at the time.
This extensibility means that you can have a large C/C++ program with some parts written in Lua, which you can easily change without recompiling everything.
Torch
Torch is a scientific computing framework based on Lua with CPU and GPU backends. You can think of it as NumPy, but with both CPU and GPU implementations. Some nice features:
Efficient linear algebra functions with GPU support
Neural Network package, with automatic differentiation (No need to backpropagate manually)
Multi-GPU support
First contact with Lua
Below are some simple Lua examples, just to get some first contact with the language.
print("Hello World") -- First thing, note that there is no main...
--[[
This is how we do a multi-line comment
on lua, to execute a lua program just use...
lua someFile.lua
]]
someVar = "Leonardo"
io.write("Size of variable is ", #someVar, " and it's value is: \"", someVar, "\"\n")
-- Variables in Lua are dynamically typed...
someVar = 10; -- You can use ";" to end a statement
io.write("Now it's value is:\"", someVar, "\"")
Lua datatypes
The language offers these basic types:
numbers (floating point)
string
boolean
table
print(type(someVar))
someVar = 'Leonardo' -- Strings can also use single quotes
print(type(someVar))
someVar = true
print(type(someVar))
someVar = {1,2,"leo",true}
print(type(someVar))
Doing some math
Normally we will rely on Torch, but Lua has some math support as well.
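For example, a few functions from Lua's standard math library:
-- Some functions from the standard math library
print(math.pi)           -- 3.1415926535898
print(math.sqrt(16))     -- 4
print(math.floor(3.7))   -- 3
print(math.max(2, 7, 5)) -- 7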
Conditionals are just the simple if-then-else; Lua does not have a switch statement.
age = 17
if age < 16 then
print(string.format("You are still a kid with %d years old\n", age))
elseif (age == 34) or (age==35) then
print("Getting old leo...")
else
print("Hi young man")
end
-- Lua does not have ternary operators
canVote = age > 18 and true or false -- canVote=true if age>18 else canVote=false
io.write("Can I vote: ", tostring(canVote))
Loops
Lua has while, repeat and for loops. The for loop also has a "for-each" form to iterate over tables, shown further below.
i = 1
while (i <= 10) do
io.write(i,"\n")
i = i+1
if i==4 then break end
end
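The repeat loop, mentioned above, checks its condition at the end, so its body always runs at least once:
-- repeat runs the body first and tests the condition at the end
i = 1
repeat
  io.write(i,"\n")
  i = i+1
until i > 3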
-- Initial value, end value, increment at each loop...
for i=1,3,1 do
io.write(i,"\n")
end
-- Create a table which is a list of items like an array
someTable = {"January", "February", "March", "April",10}
-- Iterate on table months
for keyVar, valueVar in pairs(someTable) do
io.write(valueVar, "(key=", keyVar, "), ")
end
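pairs visits every key/value in the table (in no guaranteed order); for array-like tables you can also use ipairs, which walks only the consecutive integer keys, in order:
-- ipairs iterates over indices 1,2,3,... until the first missing one
for idx, val in ipairs(someTable) do
  io.write(val, "(index=", idx, "), ")
end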
Functions
Defining functions in Lua is quite easy; the syntax is reminiscent of MATLAB.
-- Function definition
function getSum(a,b)
return a+b
end
-- Call function
print(string.format("5 + 2 = %d", getSum(5,2)))
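Lua functions can also return multiple values; here is a small made-up example (getMinMax is just an illustrative name):
-- Hypothetical function returning two values at once
function getMinMax(a, b)
  if a < b then return a, b else return b, a end
end
smallest, largest = getMinMax(7, 3)
print(smallest, largest) -- 3    7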
Tables
In Lua we use tables for everything else (i.e. lists, dictionaries, classes, etc.).
-- tables
dict = {a = 1, b = 2, c = 3}
list = {10,20,30}
-- two prints that display the same value
print(dict.a)
print(dict["a"])
-- Table indices start at 1 (like MATLAB)
print(list[1])
-- You can also add functions on tables
tab = {1,3,4}
-- Add function sum to table tab
function tab.sum ()
c = 0
for i=1,#tab do
c = c + tab[i]
end
return c
end
print(tab:sum()) -- displays 8 (the colon is used for calling methods)
-- tab:sum() means tab.sum(tab)
print(tab.sum()) -- In this case it also works (sum uses the global tab, not self)
print(tab)
Lua does not directly support OOP, but you can emulate its main features (inheritance, encapsulation) with tables and metatables.
Metatables are used to override operations (metamethods) on tables.
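As a tiny illustration (separate from the Animal example below), the __add metamethod overrides the + operator for a table:
-- Override "+" for 2d-vector-like tables via the __add metamethod
vectorMeta = {__add = function(a, b) return {x = a.x + b.x, y = a.y + b.y} end}
v1 = setmetatable({x = 1, y = 2}, vectorMeta)
v2 = setmetatable({x = 3, y = 4}, vectorMeta)
v3 = v1 + v2
print(v3.x, v3.y) -- 4    6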
--[[
Create a class "Animal" with properties:height,weight,name,sound
and methods: new,getInfo,saySomething
]]
-- Define the defaults for our table
Animal = {height = 0, weight = 0, name = "No Name", sound = "No Sound"}
-- Constructor
function Animal:new (height, weight, name, sound)
  -- Create a new table for the instance and make it inherit from Animal
  local instance = setmetatable({}, self)
  self.__index = self
  -- Fill the instance fields
  instance.height = height
  instance.weight = weight
  instance.name = name
  instance.sound = sound
  return instance
end
-- Some method
function Animal:getInfo()
animalStr = string.format("%s weighs %.1f kg, is %.1fm tall", self.name, self.weight, self.height)
return animalStr
end
function Animal:saySomething()
print(self.sound)
end
-- Create an Animal
flop = Animal:new(1,10.5,"Flop","Auau")
print(flop.name) -- same as flop["name"]
print(flop:getInfo()) -- same as flop.getInfo(flop)
print(flop:saySomething())
-- Another way to say the same thing
print(flop["name"])
print(flop.getInfo(flop))
-- Type of our object
print(type(flop))
Flop
Flop weighs 10.5 kg, is 1.0m tall
Auau
Flop
Flop weighs 10.5 kg, is 1.0m tall
table
File I/O
-- Open a file to write
file = io.open("./output.txt", "w")
-- Copy the content of the file input.txt to output.txt
for line in io.lines("./input.txt") do
print(line)
file:write(string.format("%s from input (At output)\n", line)) -- write on file
end
file:close()
Line 1 at input
Line 2 at input
Run console commands
local t = os.execute("ls")
print(t)
local catResult = os.execute("cat output.txt")
print(catResult)
FirstContactTorch.ipynb
input.txt
LuaLanguage.ipynb
output.txt
pedestrianSign.png
plot.png
true
Line 1 at input from input (At output)
Line 2 at input from input (At output)
true
First contact with Torch
In this section we're going to see how to do simple operations with Torch; more complex stuff will be dealt with later.
One of Torch's objectives is to provide some MATLAB-like functionality; a useful cheatsheet can be found on the Torch wiki.
-- Include torch library
require 'torch'; -- Like matlab the ";" also avoid echo the output
-- Create a 2x4 matrix
m = torch.Tensor({{1, 2, 3, 4}, {5, 6, 7, 8}})
print(m)
-- Get element at second row and third column
print(m[{2,3}])
-- Define some Matrix/Tensors
a = torch.Tensor(5,3) -- construct a 5x3 matrix/tensor, uninitialized
a = torch.rand(5,3) -- Create a 5x3 matrix/tensor with random values
b=torch.rand(3,4) -- Create a 3x4 matrix/tensor with random values
-- You can also fill a matrix with values (On this case with zeros)
allZeros = torch.Tensor(2,2):fill(0)
print(allZeros)
-- Matrix multiplication and its syntax variants
c = a*b
c = torch.mm(a,b)
print(c)
d=torch.Tensor(5,4)
d:mm(a,b) -- store the result of a*b in d
-- Transpose a matrix
m_trans = m:t()
print(m_trans)
-- Include torch (cuda) library
require 'cutorch'
-- Move arrays to the GPU (and convert their types to CUDA types)
a = a:cuda()
b = b:cuda()
d = d:cuda()
-- Same multiplication just different syntax
c = a*b
d:mm(a,b)
print(c)
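If you need a result back on the CPU (to print, save, or feed into CPU-only code), convert it back to a CPU tensor type:
-- Convert a CUDA tensor back to a CPU (double) tensor
c_cpu = c:double()
print(c_cpu)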
Plot = require 'itorch.Plot'
-- Give me 10 random numbers
local y = torch.randn(10)
-- Get 1d tensor from 0 to 9 (10 elements)
local x = torch.range(0, 9)
Plot():line(x, y, 'red', 'Random Data'):title('Simple Plot'):draw()
Starting with nn (XOR problem)
require "nn"
-- make a multi-layer perceptron
mlp = nn.Sequential();
-- 2 inputs, 1 output, 1 hidden layer with 20 neurons
inputs = 2; outputs = 1; hiddenUnits = 20;
-- Mount the model
mlp:add(nn.Linear(inputs, hiddenUnits))
mlp:add(nn.Tanh())
mlp:add(nn.Linear(hiddenUnits, outputs))
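You can print the container to inspect the layers that were added:
-- Show the network structure
print(mlp)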
Define the loss function
In Torch the loss function is called a criterion; as in this case we're dealing with a binary classification, we will choose the Mean Squared Error criterion.
criterion_MSE = nn.MSECriterion()
Training Manually
Here we're going to forward each sample through the model, backpropagate the loss gradient (dout) back through it, and then use gradient descent to update the parameters.
for i = 1,2500 do
-- random sample
local input= torch.randn(2); -- normally distributed example in 2d
local output= torch.Tensor(1);
-- Create XOR labels on the fly...
if input[1] * input[2] > 0 then
output[1] = -1
else
output[1] = 1
end
-- Feed to the model (with current set of weights), then calculate a loss
criterion_MSE:forward(mlp:forward(input), output)
-- Reset the current gradients before backpropagate (Always do)
mlp:zeroGradParameters()
-- Backpropagate the loss gradient through the model
mlp:backward(input, criterion_MSE:backward(mlp.output, output))
-- Update parameters(Gradient descent) with alpha=0.01
mlp:updateParameters(0.01)
end
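The outputs below come from checking the trained network on the four XOR quadrants. A test cell along these lines (the exact test inputs are an assumption) would produce them: inputs with the same sign should map to values near -1, inputs with opposite signs to values near +1.
-- Evaluate the trained model on the four quadrants (values vary per run)
x = torch.Tensor(2)
x[1] =  0.5; x[2] =  0.5; print(mlp:forward(x))
x[1] =  0.5; x[2] = -0.5; print(mlp:forward(x))
x[1] = -0.5; x[2] =  0.5; print(mlp:forward(x))
x[1] = -0.5; x[2] = -0.5; print(mlp:forward(x))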
-0.8257
[torch.DoubleTensor of size 1]
0.6519
[torch.DoubleTensor of size 1]
0.4468
[torch.DoubleTensor of size 1]
-0.7814
[torch.DoubleTensor of size 1]
Training with optim
Torch provides a standard way to optimize any function with respect to some parameters. In our case, our function will be the loss of our network, given an input, and a set of weights. The goal of training a neural net is to optimize the weights to give the lowest loss over our training set of input data. So, we are going to use optim to minimize the loss with respect to the weights, over our training set.
-- Create a dataset (128 elements)
batchSize = 128
batchInputs = torch.Tensor(batchSize, inputs)
batchLabels = torch.DoubleTensor(batchSize)
for i=1,batchSize do
local input = torch.randn(2) -- normally distributed example in 2d
local label = 1
if input[1]*input[2]>0 then -- calculate label for XOR function
label = -1;
end
batchInputs[i]:copy(input)
batchLabels[i] = label
end
-- Get flattened parameters (needed to use optim)
params, gradParams = mlp:getParameters()
-- Learning parameters
optimState = {learningRate=0.01}
require 'optim'
for epoch=1,200 do
-- local function we give to optim
-- it takes current weights as input, and outputs the loss
-- and the gradient of the loss with respect to the weights
-- gradParams is calculated implicitly by calling 'backward',
-- because the model's weight and bias gradient tensors
-- are simply views onto gradParams
local function feval(params)
gradParams:zero()
local outputs = mlp:forward(batchInputs)
local loss = criterion_MSE:forward(outputs, batchLabels)
local dloss_doutput = criterion_MSE:backward(outputs, batchLabels)
mlp:backward(batchInputs, dloss_doutput)
return loss,gradParams
end
optim.sgd(feval, params, optimState)
end
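Checking the trained network again on the same four test inputs, the outputs are once more close to the XOR targets: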
-0.6607
[torch.DoubleTensor of size 1]
0.5321
[torch.DoubleTensor of size 1]
0.8285
[torch.DoubleTensor of size 1]
-0.7458
[torch.DoubleTensor of size 1]