In this chapter we are going to learn about TensorFlow, Google's library for machine learning. In simple words, it is a library for numerical computation that uses graphs: the nodes of the graph are operations, while the edges are tensors. As a reminder, tensors are multidimensional arrays that flow through the TensorFlow graph.
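
To make the idea of a tensor as a multidimensional array concrete, here is a minimal sketch in plain Python. It is not TensorFlow code: the helper `shape_of` is a hypothetical name introduced for illustration, and it assumes regular (non-ragged) nested lists.

```python
def shape_of(value):
    """Return the shape of a regular nested list as a list of dimension sizes."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return shape

scalar = 3.0                                 # rank 0: a single number
vector = [1.0, 2.0, 3.0]                     # rank 1: a list of numbers
matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # rank 2: a list of rows

for name, tensor in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    shape = shape_of(tensor)
    print(name, "rank", len(shape), "shape", shape)
```

The rank is simply the number of dimensions, so the matrix above has rank 2 and shape `[2, 3]`.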

Once the computational graph is created, you create a session, which can execute the graph on one or more CPUs or GPUs, on a single machine or distributed across several. Here are the main components of TensorFlow:

  1. Variables: Retain their values between runs of the graph (calls to session.run); used for weights and biases

  2. Nodes: The operations

  3. Tensors: Signals that pass from/to nodes

  4. Placeholders: Used to send data between your program and the tensorflow graph

  5. Session: The place where the graph is executed.

The TensorFlow implementation translates the graph definition into executable operations distributed across available compute resources, such as the CPU or one of your computer's GPU cards. In general you do not have to specify CPUs or GPUs explicitly. TensorFlow uses your first GPU, if you have one, for as many operations as possible.

Your job as the "client" is to build this graph symbolically in code (C/C++ or Python) and then ask TensorFlow to execute it. As you may imagine, the TensorFlow code behind those "execution nodes" is high-performance C/C++ and CUDA code, which is also difficult to understand.

For example, it is common to create a graph to represent and train a neural network in the construction phase, and then repeatedly execute a set of training ops in the graph in the execution phase.
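
The split between a construction phase and an execution phase can be illustrated with a toy graph in plain Python. This is only a sketch of the idea of deferred execution, not how TensorFlow is implemented; the classes `Placeholder`, `Constant`, `Mul`, and `Add` and the function `run` are hypothetical names introduced here.

```python
class Placeholder:
    """Graph input whose value is supplied at run time."""
    def __init__(self, name):
        self.name = name

class Constant:
    """Node that always produces the same value."""
    def __init__(self, value):
        self.value = value

class Mul:
    """Operation node: multiplies the outputs of two other nodes."""
    def __init__(self, left, right):
        self.left, self.right = left, right

class Add:
    """Operation node: adds the outputs of two other nodes."""
    def __init__(self, left, right):
        self.left, self.right = left, right

def run(node, feed=None):
    """Execution phase: recursively evaluate `node`, reading placeholders from `feed`."""
    feed = feed or {}
    if isinstance(node, Placeholder):
        return feed[node]
    if isinstance(node, Constant):
        return node.value
    if isinstance(node, Mul):
        return run(node.left, feed) * run(node.right, feed)
    if isinstance(node, Add):
        return run(node.left, feed) + run(node.right, feed)
    raise TypeError("unknown node type: %r" % (node,))

# Construction phase: only the graph structure is created, nothing is computed.
x = Placeholder("x")
y = Add(Mul(Constant(0.5), x), Constant(1.0))  # y = 0.5 * x + 1.0

# Execution phase: the same graph is evaluated repeatedly with different inputs.
print(run(y, {x: 2.0}))  # 0.5 * 2.0 + 1.0 = 2.0
print(run(y, {x: 4.0}))  # 0.5 * 4.0 + 1.0 = 3.0
```

The point of the design is that `y` describes a computation rather than a value, so whoever executes it is free to decide where and how each node runs.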


If you already have a machine with Python (Anaconda, Python 3.5) and the NVIDIA CUDA drivers (7.5) installed, installing TensorFlow is simple:

export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.10.0rc0-cp35-cp35m-linux_x86_64.whl
sudo pip3 install --ignore-installed --upgrade $TF_BINARY_URL

If you still need to install some cuda drivers refer here for instructions

Simple example

As a hello world, let's build a graph that just multiplies two numbers. Notice the following sections of the code:

  • Import tensorflow library

  • Build the graph

  • Create a session

  • Run the session

Also notice that in this example we are passing constant values to our model, so it is not very useful in real life.

Exchanging data

TensorFlow lets you exchange data with your graph through "placeholders". Placeholders are assigned values when we ask the session to run. Think of them as a way to send data to your graph each time you call "session.run".

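# Import tensorflow
import tensorflow as tf
# Build graph: two placeholders that receive their values at run time
a = tf.placeholder('float')
b = tf.placeholder('float')
# Graph (note: tf.mul was renamed tf.multiply in TensorFlow 1.0)
y = tf.mul(a, b)
# Create a session (it uses the default graph)
session = tf.Session()
# Put the values 3, 4 on the placeholders a, b and evaluate y (3 * 4 = 12)
print(session.run(y, feed_dict={a: 3, b: 4}))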

Linear Regression on tensorflow

Now we're going to see how to build a linear regression model in TensorFlow.

# Import libraries (Numpy, Tensorflow, matplotlib)
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
# Show plots inline (only works inside an IPython/Jupyter notebook)
get_ipython().magic(u'matplotlib inline')
# Create 100 points following the function y = 0.1 * x + 0.3 with some normally distributed noise
num_points = 100
vectors_set = []
for i in range(num_points):
    x1 = np.random.normal(0.0, 0.55)
    y1 = x1 * 0.1 + 0.3 + np.random.normal(0.0, 0.03)
    vectors_set.append([x1, y1])
x_data = [v[0] for v in vectors_set]
y_data = [v[1] for v in vectors_set]
# Plot data
plt.plot(x_data, y_data, 'r*', label='Original data')

Now we're going to implement a graph with the function $y = W x_{data} + b$ and the loss function $loss = mean[(y - y_{data})^2]$. The loss function returns a scalar value: the mean of the squared distances between our data and the model prediction.

# Create our linear regression model
# Variables reside inside the graph memory
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b
# Define a loss function that takes into account the distance between
# the prediction and our dataset
loss = tf.reduce_mean(tf.square(y - y_data))
# Create an optimizer for our loss function (gradient descent, learning rate 0.5)
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)
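
To see what one gradient descent step does to W and b, here is a minimal plain-Python sketch of the same kind of update for the mean squared error loss. It is not TensorFlow's implementation: the function names `mse_loss` and `gradient_step`, the five hand-picked data points, and the starting values are assumptions made for illustration. The gradients used are $\partial loss / \partial W = 2 \cdot mean[(y - y_{data}) x_{data}]$ and $\partial loss / \partial b = 2 \cdot mean[y - y_{data}]$.

```python
def mse_loss(W, b, xs, ys):
    """Mean of the squared differences between prediction and data."""
    return sum((W * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_step(W, b, xs, ys, learning_rate):
    """Apply one gradient descent update to W and b and return the new values."""
    n = len(xs)
    errors = [W * x + b - y for x, y in zip(xs, ys)]
    grad_W = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
    grad_b = 2.0 * sum(errors) / n
    return W - learning_rate * grad_W, b - learning_rate * grad_b

# Noise-free points on the line y = 0.1 * x + 0.3 (hand-picked for illustration)
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [0.1 * x + 0.3 for x in xs]

W, b = 1.0, 0.0  # arbitrary starting values
for step in range(50):
    W, b = gradient_step(W, b, xs, ys, learning_rate=0.5)

print("W=%.4f b=%.4f loss=%.6f" % (W, b, mse_loss(W, b, xs, ys)))
```

On this noise-free data the parameters move toward W = 0.1 and b = 0.3, the values used to generate the points.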

With the graph built, our job is to create a session and initialize all graph variables, which in this case are our model parameters. Then we run the training operation in the session several times to train our model.

# Run session
# Initialize all graph variables
init = tf.initialize_all_variables()
# Create a session and initialize the graph variables (will actually run now...)
session = tf.Session()
session.run(init)
# Train for 8 steps
for step in range(8):
    # Optimize one step
    session.run(train)
    # Get access to graph variables (just read) with session.run(varName)
    print("Step=%d, loss=%f, [W=%f b=%f]" % (step, session.run(loss), session.run(W)[0], session.run(b)[0]))
# Plot the data and the line given by the last (lowest-loss) weights and bias
plt.plot(x_data, y_data, 'ro')
plt.plot(x_data, session.run(W) * x_data + session.run(b))
# Close the session when we're done.
session.close()

Loading data

It is almost entirely up to you to load data into TensorFlow, which means you need to parse the data yourself. One option for image classification is to have text files listing all the image file names, each followed by its class, like this:


image1.png 0
image2.png 0
image3.png 1
image4.png 1
image5.png 2
image6.png 2

A common API to load the data would be something like this.

train_data, train_label = getDataFromFile('trainingFile.txt')
val_data, val_label = getDataFromFile('validationFile.txt')
## Give to your graph through placeholders...
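
A loader along those lines could be sketched as follows. The body of `getDataFromFile` is an assumption, since the chapter only names the function: this version returns the image file names and integer labels and leaves decoding the images to the caller. It assumes one whitespace-separated `filename label` pair per line, with no spaces in file names.

```python
import os
import tempfile

def getDataFromFile(path):
    """Parse lines of the form '<image filename> <class id>'.

    Returns two parallel lists: file names and integer labels.
    """
    filenames, labels = [], []
    with open(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            parts = line.split()
            if len(parts) != 2:
                raise ValueError("%s:%d: expected '<filename> <label>'" % (path, line_number))
            filenames.append(parts[0])
            labels.append(int(parts[1]))
    return filenames, labels

# Write a small listing file like the one shown above, then parse it back.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("image1.png 0\nimage2.png 0\nimage3.png 1\n")
train_data, train_label = getDataFromFile(tmp.name)
os.remove(tmp.name)
print(train_data, train_label)
```

Keeping file names and labels as two parallel lists makes it easy to shuffle them together and cut them into batches before feeding the graph.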


TensorFlow offers a tool to help visualize what is happening in your graph. This tool is called TensorBoard: a web page where you can debug your graph by inspecting its variables, node connections, and so on.

In order to use TensorBoard you need to annotate your graph with the values that you want to inspect, for example the loss value. Then you need to merge all the summaries into a single operation, using the function tf.merge_all_summaries().

Optionally, you can also use the function "tf.name_scope" to group nodes in the graph.

After all variables are annotated and you configure your summary, you can go to the console and call:

tensorboard --logdir=/home/leo/test

Taking the previous example, here are the changes needed to send information to TensorBoard.

1) First, during the building phase, we annotate the information in the graph that we want to inspect. Then we call merge_all_summaries(). In our case we annotate loss (scalar) and W, b (histograms).

# Create our linear regression model
# Variables reside inside the graph memory
# tf.name_scope organizes things on the tensorboard graph view
with tf.name_scope("LinearReg") as scope:
    W = tf.Variable(tf.random_uniform([1], -1.0, 1.0), name="Weights")
    b = tf.Variable(tf.zeros([1]), name="Bias")
    y = W * x_data + b
# Define a loss function that takes into account the distance between
# the prediction and our dataset
with tf.name_scope("LossFunc") as scope:
    loss = tf.reduce_mean(tf.square(y - y_data))
# Create an optimizer for our loss function
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)
#### Tensorboard stuff
# Annotate loss, weights and bias (needed for tensorboard)
loss_summary = tf.scalar_summary("loss", loss)
w_h = tf.histogram_summary("W", W)
b_h = tf.histogram_summary("b", b)
# Merge all the summaries
merged_op = tf.merge_all_summaries()

2) During our session creation we need to add a call to "tf.train.SummaryWriter" to create a writer. You need to pass a directory where tensorflow will save the summaries.

# Initialize all graph variables
init = tf.initialize_all_variables()
# Create a session and initialize the graph variables (will actually run now...)
session = tf.Session()
session.run(init)
# Writer for tensorboard (directory)
writer_tensorboard = tf.train.SummaryWriter('/home/leo/test', session.graph_def)

3) Then when we execute our graph, for example during training we can ask tensorflow to generate a summary. Of course calling this every time will impact performance. To manage this you could call this at the end of every epoch.

for step in range(1000):
    # Optimize one step
    session.run(train)
    # Add summary (every step could be too much...)
    result_summary = session.run(merged_op)
    writer_tensorboard.add_summary(result_summary, step)
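
One way to limit the overhead is to write a summary only once per epoch rather than at every step. The sketch below shows just the step-selection logic in plain Python; `steps_per_epoch` and the helper `is_epoch_end` are hypothetical names with an assumed value, and the comment marks where the summary calls from the loop above would go.

```python
def is_epoch_end(step, steps_per_epoch):
    """True when `step` (0-based) is the last training step of an epoch."""
    return (step + 1) % steps_per_epoch == 0

steps_per_epoch = 100  # assumed value: number of training steps in one epoch
logged_steps = []
for step in range(1000):
    # ... run one training step here ...
    if is_epoch_end(step, steps_per_epoch):
        # This is where session.run(merged_op) and add_summary(...) would be called.
        logged_steps.append(step)

print(len(logged_steps), logged_steps[:3])  # 10 [99, 199, 299]
```

With these numbers the summary is written 10 times instead of 1000.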

Results on tensorboard

Here we can see our linear regression model as a computing graph.

Below we can see how the loss evolved on each iteration.

Sometimes IPython holds on to old versions of your graph, which creates problems when using TensorBoard. If you run into problems, one option is to restart the kernel.

Using GPUs

Tensorflow also allows you to use GPUs to execute graphs or particular sections of your graph.

On a common machine learning system you would have one multi-core CPU with one or more GPUs. TensorFlow represents them as follows:

  • "/cpu:0": Multicore CPU

  • "/gpu:0": First GPU

  • "/gpu:1": Second GPU

Unfortunately tensorflow does not have an official function to list the devices available on your system, but there is an unofficial way.

from tensorflow.python.client import device_lib
def get_devices_available():
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos]
print(get_devices_available())
# Example output on a machine with two GPUs:
# ['/cpu:0', '/gpu:0', '/gpu:1']

Fix graph to a device

Use the "with tf.device('/gpu:0')" statement in Python to pin all nodes created inside that block to a particular GPU.

import tensorflow as tf
# Creates a graph.
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True; this will dump
# to the log how tensorflow is mapping the operations to devices
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
# Output:
# [[ 22. 28.]
#  [ 49. 64.]]
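
The printed result can be checked by hand. The following plain-Python sketch (the helper `matmul` is a hypothetical name, not a TensorFlow function) multiplies the same 2x3 and 3x2 matrices row by column.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]    # shape [2, 3]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # shape [3, 2]
print(matmul(a, b))  # [[22.0, 28.0], [49.0, 64.0]]
```

For instance, the top-left entry is 1*1 + 2*3 + 3*5 = 22.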

Multiple Gpus and training

Now we will explain how training is done on a multiple GPU system.

Basically, the steps for multi-GPU training are:

  1. Separate your training data in batches as usual

  2. Create a copy of the model in each gpu

  3. Distribute different batches for each gpu

  4. Each gpu will forward the batch and calculate its gradients

  5. Each gpu will send the gradients to the cpu

  6. The cpu will average each gradient, and do a gradient descent. The model parameters are updated with the gradients averaged across all model replicas.

  7. The cpu will distribute the new model to all gpus

  8. The process loops again until all training is done
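
The steps above can be sketched in plain Python by simulating the replicas in a single process. This is a conceptual illustration only, with no real GPUs and no TensorFlow calls: the model is the linear regression from earlier in the chapter, and the names `replica_gradients`, `average_gradients`, and `train_data_parallel`, the batch layout, and the learning rate are all assumptions.

```python
def replica_gradients(W, b, batch):
    """Steps 4-5: one replica computes the MSE gradients for its own batch."""
    n = len(batch)
    errors = [(W * x + b - y, x) for x, y in batch]
    grad_W = 2.0 * sum(e * x for e, x in errors) / n
    grad_b = 2.0 * sum(e for e, _ in errors) / n
    return grad_W, grad_b

def average_gradients(grads):
    """Step 6: the CPU averages the gradients received from all replicas."""
    n = len(grads)
    return (sum(g[0] for g in grads) / n, sum(g[1] for g in grads) / n)

def train_data_parallel(batches, num_replicas, learning_rate, epochs):
    W, b = 0.0, 0.0  # single master copy of the parameters, kept by the CPU
    for _ in range(epochs):
        # Step 3: hand a different batch to each replica, num_replicas at a time.
        for start in range(0, len(batches), num_replicas):
            group = batches[start:start + num_replicas]
            # Steps 2 and 7: every replica works from the same current W and b.
            grads = [replica_gradients(W, b, batch) for batch in group]
            grad_W, grad_b = average_gradients(grads)
            # Step 6: one gradient descent update with the averaged gradients.
            W -= learning_rate * grad_W
            b -= learning_rate * grad_b
    return W, b

# Step 1: split noise-free data on y = 0.1 * x + 0.3 into four batches.
xs = [i / 10.0 - 1.0 for i in range(20)]
data = [(x, 0.1 * x + 0.3) for x in xs]
batches = [data[i::4] for i in range(4)]

W, b = train_data_parallel(batches, num_replicas=2, learning_rate=0.5, epochs=200)
print("W=%.3f b=%.3f" % (W, b))
```

Because all replicas have equally sized batches, averaging their gradients gives the same update as one gradient computed over the combined batch, which is why this scheme behaves like ordinary training with a larger batch.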