# Lua and Torch

In this book I have stressed the importance of knowing how to write your own deep learning/artificial intelligence library. But it is also very important, especially while researching a topic, to understand the most common libraries. This chapter teaches the basics of Torch, but before that we are going to learn some Lua.

# Lua language

Lua was first created to be used on embedded systems; the idea was to have a simple, fast, cross-platform language. One of the main features of Lua is its easy integration with C/C++.

Lua was originally designed in 1993 as a language for extending software applications to meet the increasing demand for customization at the time.

This extension means that you could have a large C/C++ program with some parts written in Lua, and those parts could be changed easily without recompiling everything.

# Torch

Torch is a scientific computing framework based on Lua, with CPU and GPU backends. You can think of it as NumPy with both CPU and GPU implementations. Some nice features:

• Efficient linear algebra functions with GPU support

• Neural Network package, with automatic differentiation (No need to backpropagate manually)

• Multi-GPU support

# First contact with Lua

Below are some simple Lua examples, just to get a first contact with the language.

```lua
print("Hello World") -- First thing, note that there is no main...
--[[
This is how we do a multi-line comment
on lua, to execute a lua program just use...
lua someFile.lua
]]
someVar = "Leonardo"
io.write("Size of variable is ", #someVar, " and its value is: \"", someVar, "\"\n")
-- Variables on lua are dynamically typed...
someVar = 10; -- You can use ";" to end a statement
io.write("Now its value is:\"", someVar, "\"")
```

## Lua datatypes

The language offers these basic types:

• number (float)

• string

• boolean

• table

```lua
print(type(someVar))
someVar = 'Leonardo' -- Strings can also use single quotes
print(type(someVar))
someVar = true
print(type(someVar))
someVar = {1,2,"leo",true}
print(type(someVar))
```
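The list above covers the types used most in this chapter. As a small sketch using only the standard `type` function, the snippet below shows two more results it can report: `nil`, which is what an unassigned variable holds, and `function`, since functions are ordinary values in Lua. The variable name `undefinedVar` is just a placeholder.

```lua
-- A variable that was never assigned holds nil
print(type(undefinedVar))   -- nil

-- Functions are values too, so they have their own type
print(type(print))          -- function

-- Whole and fractional numbers both report the "number" type
print(type(10), type(3.14))
```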

## Doing some math

Normally we will rely on Torch, but Lua has some math support as well.

```lua
io.write("5 + 3 = ", 5+3, "\n")
io.write("5 - 3 = ", 5-3, "\n")
io.write("5 * 3 = ", 5*3, "\n")
io.write("5 / 3 = ", 5/3, "\n")
io.write("5 % 3 = ", 5%3, "\n")
-- Generate a random integer between 0 and 3
io.write("math.random(0,3) : ", math.random(0,3), "\n")
-- Print float to 10 decimals
print(string.format("Pi = %.10f", math.pi))
```

```
5 + 3 = 8
5 - 3 = 2
5 * 3 = 15
5 / 3 = 1.6666666666667
5 % 3 = 2
math.random(0,3) : 2
Pi = 3.1415926536
```
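`math.random` behaves differently depending on its arguments, which is easy to mix up. The sketch below shows the three call forms from the standard library together with two rounding helpers. The seed value 42 and the variable names are arbitrary choices for illustration.

```lua
math.randomseed(42)            -- fix the seed so runs are repeatable

r1 = math.random()             -- float in [0, 1)
r2 = math.random(6)            -- integer in [1, 6]
r3 = math.random(0, 3)         -- integer in [0, 3]

io.write("float: ", r1, ", die roll: ", r2, ", in range: ", r3, "\n")

-- Rounding helpers from the math library
print(math.floor(5 / 3), math.ceil(5 / 3))   -- prints 1 and 2
```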

## Lua include (require)

The Lua statement for including other Lua files is "require". As usual, it is used to load a library.

```lua
require 'image'
pedestrian = image.load('./pedestrianSign.png')
itorch.image(pedestrian)
```
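`require` also works for your own modules: a module is usually just a file that returns a table. To keep the sketch in one file, the hypothetical module `mymath` below is registered through `package.preload` instead of living in its own `mymath.lua`. The module name and its `square` function are made up for illustration.

```lua
-- Normally this function body would be the content of mymath.lua
package.preload["mymath"] = function()
  local M = {}
  function M.square(x)
    return x * x
  end
  return M
end

-- require runs the loader once and caches the returned table
mymath = require "mymath"
print(mymath.square(4))              -- 16
print(require("mymath") == mymath)   -- true (served from the cache)
```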

## Conditionals

Just the simple if-then-else. Lua does not have a switch statement.

```lua
age = 17
if age < 16 then
    print(string.format("You are still a kid with %d years old\n", age))
elseif (age == 34) or (age==35) then
    print("Getting old leo...")
else
    print("Hi young man")
end

-- Lua does not have ternary operators
canVote = age > 18 and true or false -- canVote=true if age>18 else canVote=false
io.write("Can I vote: ", tostring(canVote))
```
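The `and`/`or` idiom stands in for a ternary operator, but it has one trap: if the value in the middle is `false` or `nil`, the expression falls through to the last operand. The sketch below shows the trap and a plain `if` that avoids it. The variable names are illustrative only.

```lua
age = 17

-- Works: the middle value ("minor") is truthy
status = age < 18 and "minor" or "adult"
print(status)                      -- minor

-- Trap: the middle value is false, so "or" falls through to the last operand
viaIdiom = age < 18 and false or true
print(viaIdiom)                    -- true, although we wanted false

-- A plain if/else gives the intended result
if age < 18 then
  viaIf = false
else
  viaIf = true
end
print(viaIf)                       -- false
```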

## Loops

Lua has while, repeat, and for loops. The for loop also has a "for-each" form to iterate over tables.

```lua
i = 1
while (i <= 10) do
    io.write(i,"\n")
    i = i+1
    if i==4 then break end
end

-- Initial value, end value, increment at each loop...
for i=1,3,1 do
    io.write(i,"\n")
end

-- Create a table which is a list of items like an array
someTable = {"January", "February", "March", "April",10}

-- Iterate on table months
for keyVar, valueVar in pairs(someTable) do
  io.write(valueVar, "(key=", keyVar, "), ")
end
```

```
January(key=1), February(key=2), March(key=3), April(key=4), 10(key=5),
```
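The `repeat` loop is mentioned above but not shown. As a short sketch, it runs its body at least once and stops when the condition becomes true. The second half uses `ipairs`, which visits the array part of a table in index order, whereas `pairs` makes no promise about ordering. The names `count`, `months`, and `joined` are illustrative.

```lua
-- repeat ... until: the condition is checked after the body
count = 0
repeat
  count = count + 1
until count >= 3
print(count)                       -- 3

-- ipairs walks indices 1, 2, 3, ... in order and stops at the first nil
months = {"January", "February", "March"}
joined = ""
for index, name in ipairs(months) do
  joined = joined .. index .. "=" .. name .. " "
end
print(joined)                      -- 1=January 2=February 3=March
```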

## Functions

Defining functions in Lua is quite easy; the syntax is reminiscent of MATLAB.

```lua
-- Function definition
function getSum(a,b)
    return a+b
end

-- Call function
print(string.format("5 + 2 = %d", getSum(5,2)))
```
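Two features set Lua functions apart from those of many other languages: they can return several values at once, and they are first-class values that capture local variables (closures). A small sketch, with hypothetical helper names `minMax` and `makeCounter`:

```lua
-- Return two values at once
function minMax(a, b)
  if a < b then
    return a, b
  end
  return b, a
end

lo, hi = minMax(7, 2)
print(lo, hi)                      -- prints 2 and 7

-- A closure: each counter keeps its own private "count"
function makeCounter()
  local count = 0
  return function()
    count = count + 1
    return count
  end
end

counter = makeCounter()
counter()
print(counter())                   -- 2
```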

## Tables

In Lua we use tables for everything else (e.g. lists, dictionaries, classes, etc.).

```lua
-- tables
dict = {a = 1, b = 2, c = 3}
list = {10,20,30}

-- two prints that display the same value
print(dict.a)
print(dict["a"])
-- Tables start with 1 (Like matlab)
print(list[1])

-- You can also add functions on tables
tab = {1,3,4}
-- Add function sum to table tab
function tab.sum ()
  local c = 0 -- local, so we do not leak a global variable
  for i=1,#tab do
    c = c + tab[i]
  end
  return c
end

print(tab:sum()) -- displays 8 (the colon is used for calling methods)
-- tab:sum() means tab.sum(tab)
print(tab.sum()) -- In this case it will also work
print(tab)
```

```
1
1
10
8
8
{
  1 : 1
  2 : 3
  3 : 4
  sum : function: 0x4035ede8
}
```
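The `sum` function above ignores its argument and reads the global `tab`, which is why both call styles work. To show what the colon really does, the sketch below defines a method with `:` so that it receives the table as `self`. The `account` and `mixed` tables are made-up examples.

```lua
account = {balance = 100}

-- "function account:deposit(amount)" is shorthand for
-- "function account.deposit(self, amount)"
function account:deposit(amount)
  self.balance = self.balance + amount
  return self.balance
end

print(account:deposit(50))          -- 150
print(account.deposit(account, 25)) -- 175 (same call, self passed by hand)

-- The length operator counts the array part; named fields are not included
mixed = {10, 20, 30, label = "x"}
print(#mixed)                       -- 3
```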

## Object oriented programming

Lua does not directly support OOP, but you can emulate all its main features (inheritance, encapsulation) with tables and metatables.

Metatables are used to override operations (metamethods) on tables.
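As a minimal sketch of a metamethod, the snippet below makes `+` work on two tables by defining `__add` in their shared metatable, and makes `print` show something readable through `__tostring`. The `Vec` table and its `new` helper are hypothetical names for this illustration.

```lua
Vec = {}
Vec.__index = Vec

function Vec.new(x, y)
  return setmetatable({x = x, y = y}, Vec)
end

-- Called whenever "+" is applied to a table whose metatable is Vec
Vec.__add = function(a, b)
  return Vec.new(a.x + b.x, a.y + b.y)
end

-- Called by tostring() and therefore by print()
Vec.__tostring = function(v)
  return "(" .. v.x .. ", " .. v.y .. ")"
end

v = Vec.new(1, 2) + Vec.new(3, 4)
print(v)                            -- (4, 6)
```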

```lua
--[[
Create a class "Animal" with properties: height, weight, name, sound
and methods: new, getInfo, saySomething
]]

-- Define the defaults for our table
Animal = {height = 0, weight = 0, name = "No Name", sound = "No Sound"}
-- Instances look up missing fields (defaults and methods) in Animal
Animal.__index = Animal

-- Constructor
function Animal:new (height, weight, name, sound)
  -- Create an empty table and set Animal as its metatable
  -- (this is how we create an instance)
  local instance = setmetatable({}, Animal)
  instance.height = height
  instance.weight = weight
  instance.name = name
  instance.sound = sound
  return instance
end

-- Some method
function Animal:getInfo()
  -- self is a reference to the instance the method was called on
  local animalStr = string.format("%s weighs %.1f kg, is %.1f m tall", self.name, self.weight, self.height)
  return animalStr
end

function Animal:saySomething()
  print(self.sound)
end
```

```lua
-- Create an Animal
flop = Animal:new(1,10.5,"Flop","Auau")
print(flop.name) -- same as flop["name"]
print(flop:getInfo()) -- same as flop.getInfo(flop)
flop:saySomething() -- prints the sound itself

-- Another way to say the same thing
print(flop["name"])
print(flop.getInfo(flop))

-- Type of our object
print(type(flop))
```

```
Flop
Flop weighs 10.5 kg, is 1.0 m tall
Auau
Flop
Flop weighs 10.5 kg, is 1.0 m tall
table
```
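Inheritance is mentioned above but not demonstrated. One common way to sketch it is to chain metatables, so that a lookup that fails on the subclass falls back to the base class. The example is self-contained and uses hypothetical classes `Base` and `Dog` rather than the `Animal` table.

```lua
-- Base class
Base = {}
Base.__index = Base

function Base.new(name, sound)
  return setmetatable({name = name, sound = sound}, Base)
end

function Base:speak()
  return self.name .. " says " .. self.sound
end

-- Subclass: fields missing on Dog are looked up in Base
Dog = setmetatable({}, {__index = Base})
Dog.__index = Dog

function Dog.new(name)
  local dog = Base.new(name, "Auau")
  return setmetatable(dog, Dog)     -- re-tag the instance as a Dog
end

-- A method that exists only on the subclass
function Dog:fetch()
  return self.name .. " fetches the ball"
end

rex = Dog.new("Rex")
print(rex:speak())                  -- Rex says Auau (inherited from Base)
print(rex:fetch())                  -- Rex fetches the ball
```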

## File I/O

```lua
-- Open a file to write
file = io.open("./output.txt", "w")

-- Copy the content of the file input.txt to output.txt
for line in io.lines("./input.txt") do
  print(line)
  file:write(string.format("%s from input (At output)\n", line)) -- write to the file
end

file:close()
```

```
Line 1 at input
Line 2 at input
```
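`io.open` does not raise an error when it fails; it returns `nil` plus a message, so the result is worth checking. The sketch below writes and reads back a temporary file (the path comes from `os.tmpname`, so no existing file is touched) and then shows the failure case with a path that is assumed not to exist on your machine.

```lua
path = os.tmpname()

-- Write two lines, checking that the file really opened
out = assert(io.open(path, "w"))
out:write("Line 1\n", "Line 2\n")
out:close()

-- Read them back line by line
lines = {}
for line in io.lines(path) do
  lines[#lines + 1] = line
end
print(#lines, lines[1])             -- prints 2 and Line 1

os.remove(path)

-- Failure case: no error is thrown, we get nil and a message instead
missing, err = io.open("./no/such/dir/input.txt", "r")
print(missing, err)
```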

## Run console commands

```lua
local t = os.execute("ls")
print(t)
local catResult = os.execute("cat output.txt")
print(catResult)
```

```
FirstContactTorch.ipynb
input.txt
LuaLanguage.ipynb
output.txt
pedestrianSign.png
plot.png
true

Line 1 at input from input (At output)
Line 2 at input from input (At output)
true
```
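`os.execute` only returns the command's status; the text the command prints goes straight to the terminal. To capture that text in a Lua string, one option is `io.popen`, which is part of the standard library but is not available on every platform. A sketch, assuming a POSIX shell with `echo`:

```lua
-- Run the command and read everything it prints
handle = io.popen("echo hello from the shell")
captured = handle:read("*a")       -- "*a" reads the whole stream
handle:close()

io.write("captured: ", captured)
```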

# First contact with Torch

In this section we are going to see how to do simple operations with Torch; more complex topics will be dealt with later.

One of Torch's goals is to provide some MATLAB-like functionality, and a useful cheat sheet exists for it.

```lua
-- Include torch library
require 'torch'; -- Like matlab, the ";" also avoids echoing the output

-- Create a 2x4 matrix
m = torch.Tensor({{1, 2, 3, 4}, {5, 6, 7, 8}})
print(m)

-- Get element at second row and third column
print(m[{2,3}])
```

```
 1  2  3  4
 5  6  7  8
[torch.DoubleTensor of size 2x4]

7
```

## Some Matrix operations

```lua
-- Define some Matrix/Tensors
a = torch.Tensor(5,3) -- construct a 5x3 matrix/tensor, uninitialized
a = torch.rand(5,3) -- Create a 5x3 matrix/tensor with random values
b = torch.rand(3,4) -- Create a 3x4 matrix/tensor with random values

-- You can also fill a matrix with values (in this case with zeros)
allZeros = torch.Tensor(2,2):fill(0)
print(allZeros)

-- Matrix multiplication and its syntax variants
c = a*b
c = torch.mm(a,b)
print(c)
d = torch.Tensor(5,4)
d:mm(a,b) -- store the result of a*b in d

-- Transpose a matrix
m_trans = m:t()
print(m_trans)
```

```
 0  0
 0  0
[torch.DoubleTensor of size 2x2]

 0.8259  0.6816  0.3766  0.7048
 1.3681  0.9438  0.7739  1.1653
 1.2885  0.9653  0.5573  0.9808
 1.2556  0.8850  0.5501  0.9142
 1.8468  1.3579  0.7680  1.3500
[torch.DoubleTensor of size 5x4]

 1  5
 2  6
 3  7
 4  8
[torch.DoubleTensor of size 4x2]
```
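To make the shape rule concrete, here is a plain-Lua sketch of what a matrix product such as `torch.mm(a, b)` computes: an n×k matrix times a k×m matrix gives an n×m matrix. The `matmul` helper is written for illustration with nested tables; it is not how Torch implements the operation, and the sample matrices are made up.

```lua
function matmul(left, right)
  local n, k, m = #left, #right, #right[1]
  assert(#left[1] == k, "inner dimensions must match")
  local result = {}
  for i = 1, n do
    result[i] = {}
    for j = 1, m do
      local sum = 0
      for p = 1, k do
        sum = sum + left[i][p] * right[p][j]
      end
      result[i][j] = sum
    end
  end
  return result
end

left    = {{1, 2, 3}, {4, 5, 6}}      -- 2x3
right   = {{1, 0}, {0, 1}, {1, 1}}    -- 3x2
product = matmul(left, right)         -- 2x2
print(product[1][1], product[1][2])   -- prints 4 and 5
print(product[2][1], product[2][2])   -- prints 10 and 11
```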

## Doing operations on GPU

```lua
-- Include torch (cuda) library
require 'cutorch'

-- Move arrays to GPU (and convert their types to cuda types)
a = a:cuda()
b = b:cuda()
d = d:cuda()

-- Same multiplication just different syntax
c = a*b
d:mm(a,b)

print(c)
```

```
 1.1058  0.6183  1.0518  0.7451
 0.5932  0.8015  0.9441  0.5477
 0.4915  0.8143  0.9613  0.4345
 0.1699  0.6697  0.6656  0.2500
 0.6525  0.6174  0.8894  0.4446
[torch.CudaTensor of size 5x4]
```

## Plotting

```lua
Plot = require 'itorch.Plot'

-- Give me 10 random numbers
local y = torch.randn(10)

-- Get 1d tensor from 0 to 9 (10 elements)
local x = torch.range(0, 9)
Plot():line(x, y,'red' ,'Sinus Wave'):title('Simple Plot'):draw()
```

## Starting with nn (XOR problem)

```lua
require "nn"

-- make a multi-layer perceptron
mlp = nn.Sequential();
-- 2 inputs, one output, 1 hidden layer with 20 neurons
inputs = 2; outputs = 1; hiddenUnits = 20;

-- Mount the model
mlp:add(nn.Linear(inputs, hiddenUnits))
mlp:add(nn.Tanh())
mlp:add(nn.Linear(hiddenUnits, outputs))
```
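As a rough picture of what a forward pass through this stack does, each layer's output is fed to the next: a linear map, then `tanh`, then another linear map. The plain-Lua sketch below follows that chain with nested tables. The helper names and the tiny hand-picked weights (2 hidden units instead of 20) are assumptions for illustration; Torch initialises its weights randomly and works on tensors.

```lua
-- One fully connected layer: y = W * x + b
function linearForward(W, b, x)
  local y = {}
  for i = 1, #W do
    y[i] = b[i]
    for j = 1, #x do
      y[i] = y[i] + W[i][j] * x[j]
    end
  end
  return y
end

-- Element-wise tanh, written with math.exp so it runs on any Lua version
function tanhForward(x)
  local y = {}
  for i = 1, #x do
    local e = math.exp(2 * x[i])
    y[i] = (e - 1) / (e + 1)
  end
  return y
end

-- Made-up weights: 2 inputs -> 2 hidden units -> 1 output
W1, b1 = {{1, -1}, {-1, 1}}, {0, 0}
W2, b2 = {{1, 1}}, {0}

hidden = tanhForward(linearForward(W1, b1, {0.5, 0.5}))
result = linearForward(W2, b2, hidden)
print(result[1])
```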

### Define the loss function

In Torch the loss function is called a criterion. In this case we are dealing with a binary classification problem, and we will use the Mean Squared Error criterion.

```lua
criterion_MSE = nn.MSECriterion()
```
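For reference, the mean squared error averages the squared differences between predictions and targets. The plain-Lua sketch below computes that value and its gradient with respect to each prediction. The helper names and sample values are made up, and averaging over the number of elements is an assumption about the criterion's default behaviour.

```lua
-- Mean squared error: (1/n) * sum((y_i - t_i)^2)
function mseLoss(predictions, targets)
  local total = 0
  for i = 1, #predictions do
    local diff = predictions[i] - targets[i]
    total = total + diff * diff
  end
  return total / #predictions
end

-- Gradient of the loss with respect to each prediction: (2/n) * (y_i - t_i)
function mseGrad(predictions, targets)
  local grad = {}
  for i = 1, #predictions do
    grad[i] = 2 * (predictions[i] - targets[i]) / #predictions
  end
  return grad
end

preds  = {0.5, -1.0}
labels = {1.0, -1.0}
print(mseLoss(preds, labels))       -- 0.125
print(mseGrad(preds, labels)[1])    -- -0.5
```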

### Training Manually

Here we back-propagate through our model: we compute the gradient of the loss with respect to the network output, $dout$, propagate it backward through the layers, and then use gradient descent to update the parameters.

```lua
for i = 1,2500 do
  -- random sample
  local input = torch.randn(2);    -- normally distributed example in 2d
  local output = torch.Tensor(1);
  -- Create XOR labels on the fly....
  if input[1] * input[2] > 0 then
    output[1] = -1
  else
    output[1] = 1
  end

  -- Feed to the model (with current set of weights), then calculate a loss
  criterion_MSE:forward(mlp:forward(input), output)

  -- Reset the current gradients before backpropagating (always do this)
  mlp:zeroGradParameters()
  -- Backpropagate the loss to the hidden layer
  mlp:backward(input, criterion_MSE:backward(mlp.output, output))
  -- Update parameters (gradient descent) with alpha=0.01
  mlp:updateParameters(0.01)
end
```
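The call `mlp:updateParameters(0.01)` applies the plain gradient descent rule, parameter = parameter - learningRate * gradient, to every weight. As a toy sketch of that rule outside Torch, the loop below minimises $f(w) = (w - 3)^2$, whose gradient is $2(w - 3)$. The function, the starting point, and the learning rate 0.1 are chosen only for illustration.

```lua
w = 0                               -- initial parameter
learningRate = 0.1

for step = 1, 100 do
  local gradient = 2 * (w - 3)      -- df/dw at the current w
  w = w - learningRate * gradient   -- gradient descent update
end

print(string.format("w after training: %.4f", w))  -- w after training: 3.0000
```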

### Test the network

```lua
x = torch.Tensor(2)
x[1] =  0.5; x[2] =  0.5; print(mlp:forward(x)) -- 0 XOR 0 = 0 (negative)
x[1] =  0.5; x[2] = -0.5; print(mlp:forward(x)) -- 0 XOR 1 = 1 (positive)
x[1] = -0.5; x[2] =  0.5; print(mlp:forward(x)) -- 1 XOR 0 = 1 (positive)
x[1] = -0.5; x[2] = -0.5; print(mlp:forward(x)) -- 1 XOR 1 = 0 (negative)
```

```
-0.8257
[torch.DoubleTensor of size 1]

 0.6519
[torch.DoubleTensor of size 1]

 0.4468
[torch.DoubleTensor of size 1]

-0.7814
[torch.DoubleTensor of size 1]
```

### Training with optim

Torch provides a standard way to optimize any function with respect to some parameters. In our case, our function will be the loss of our network, given an input, and a set of weights. The goal of training a neural net is to optimize the weights to give the lowest loss over our training set of input data. So, we are going to use optim to minimize the loss with respect to the weights, over our training set.

```lua
-- Create a dataset (128 elements)
batchSize = 128
batchInputs = torch.Tensor(batchSize, inputs)
batchLabels = torch.DoubleTensor(batchSize)

for i=1,batchSize do
  local input = torch.randn(2)     -- normally distributed example in 2d
  local label = 1
  if input[1]*input[2]>0 then     -- calculate label for XOR function
    label = -1;
  end
  batchInputs[i]:copy(input)
  batchLabels[i] = label
end
```

```lua
-- Get flatten parameters (Needed to use optim)
params, gradParams = mlp:getParameters()
-- Learning parameters
optimState = {learningRate=0.01}
```

```lua
require 'optim'

for epoch=1,200 do
  -- local function we give to optim
  -- it takes current weights as input, and outputs the loss
  -- and the gradient of the loss with respect to the weights
  -- gradParams is calculated implicitly by calling 'backward',
  -- because the model's weight and bias gradient tensors
  -- are simply views onto gradParams
  local function feval(params)
    gradParams:zero()

    local outputs = mlp:forward(batchInputs)
    local loss = criterion_MSE:forward(outputs, batchLabels)
    local dloss_doutput = criterion_MSE:backward(outputs, batchLabels)
    mlp:backward(batchInputs, dloss_doutput)
    return loss,gradParams
  end
  optim.sgd(feval, params, optimState)
end
```
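The contract between the optimiser and `feval` is small: the optimiser calls `feval(params)`, receives the loss and the gradient, and moves the parameters against the gradient. The plain-Lua sketch below mimics that contract with tables instead of tensors. The `simpleSgd` function is a hypothetical stand-in written for this example, not the `optim` implementation, and it leaves out extras such as momentum and weight decay.

```lua
-- Hypothetical stand-in for one optimiser step: updates params in place
function simpleSgd(feval, params, state)
  local loss, grad = feval(params)
  for i = 1, #params do
    params[i] = params[i] - state.learningRate * grad[i]
  end
  return params, loss
end

-- Function to minimise: f(p) = (p1 - 1)^2 + (p2 + 2)^2
function quadraticFeval(p)
  local loss = (p[1] - 1)^2 + (p[2] + 2)^2
  local grad = {2 * (p[1] - 1), 2 * (p[2] + 2)}
  return loss, grad
end

weights = {0, 0}
sgdState = {learningRate = 0.1}
for epoch = 1, 200 do
  simpleSgd(quadraticFeval, weights, sgdState)
end
print(string.format("%.3f %.3f", weights[1], weights[2]))  -- 1.000 -2.000
```

Keeping the loss and gradient computation inside `feval` is what lets the same training loop work with a different optimiser: only the function that consumes `feval` changes.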

### Test the network

```lua
x = torch.Tensor(2)
x[1] =  0.5; x[2] =  0.5; print(mlp:forward(x)) -- 0 XOR 0 = 0 (negative)
x[1] =  0.5; x[2] = -0.5; print(mlp:forward(x)) -- 0 XOR 1 = 1 (positive)
x[1] = -0.5; x[2] =  0.5; print(mlp:forward(x)) -- 1 XOR 0 = 1 (positive)
x[1] = -0.5; x[2] = -0.5; print(mlp:forward(x)) -- 1 XOR 1 = 0 (negative)
```

```
-0.6607
[torch.DoubleTensor of size 1]

 0.5321
[torch.DoubleTensor of size 1]

 0.8285
[torch.DoubleTensor of size 1]

-0.7458
[torch.DoubleTensor of size 1]
```