Tensorflow 9: MNIST 2

This follows on from my previous post. All my code can be found on GitHub (9_MNIST_2.ipynb).

In the last guide we built a neural network with a single hidden layer and achieved 86% accuracy.

Now we want to try to increase the accuracy by adding an additional hidden layer.

We will have the exact same network as before, except that we will now have two hidden layers of 256 nodes each.

Network Architecture

Firstly the network parameters are described, then the training parameters. Then I will create weights and biases for the different layers, and finally link them all up.

We train this exactly as before, with the same training and accuracy code, and we now get 95%. That’s a good accuracy increase.

Simplify Code

The code is starting to balloon out and look complicated for what is quite a simple network.  Imagine how it would look with 8 layers or 15.

Let’s use some built-in higher-level Tensorflow operations to simplify the code. Rewrite it as follows:

Result:

tf.contrib.layers.fully_connected creates a fully connected layer: you define the input tensor, output size and activation function, and it links everything up for you. This is great if you are not doing anything unusual. In the image below from Tensorboard you can see that the weights, biases, matmul, addition and relu activation are all automatically generated.

These higher-level operations can also handle the max pooling and convolutional layers we will learn about soon.

Adding many more layers

Since the accuracy improved by adding another layer, could we just keep on adding more nodes and layers? There are some issues to consider with this approach. Firstly, the more layers and nodes you add, the more weights and biases you need to store and train, which means more memory usage and more training time. Secondly, simply adding more nodes and layers does not guarantee better results.
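To make that cost concrete, here is a quick back-of-the-envelope parameter count for the two-hidden-layer network above:

```python
# Each layer contributes (inputs * outputs) weights plus (outputs) biases.
sizes = [784, 256, 256, 10]  # input, hidden 1, hidden 2, output
params = sum(a * b + b for a, b in zip(sizes[:-1], sizes[1:]))
print(params)  # 269322 parameters to store and train
```

Every extra 256-node hidden layer adds another 256 × 256 + 256 = 65,792 parameters on top of that.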

Let’s add more layers and see the effect it has. First, let’s go with 5 layers.

Now let’s go with 10 layers and more training steps (100,000).

Since the top MNIST results achieve well over 99% accuracy, we must be missing something. Soon we will introduce Convolutional Neural Networks (CNNs), which will increase our accuracy even more and can be used for even harder image recognition problems like CIFAR-10 and CIFAR-100.

Tensorboard monitoring

To change the code to track values via Tensorboard:

Firstly, add histogram summaries for the fully connected layers.

I also want to track how the loss changes:

Merge summaries and create the writer

Change training to compute the summaries:

Now I can see how the scalar value of the loss changes with training steps.

Here I have histogram values of fc1, the first fully connected layer.

Problem – Random input

Because the system assumes every input corresponds to one of the ten labels, Softmax will still predict a digit even if the input is junk (random values).
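A plain NumPy sketch of why this happens: softmax turns any vector of logits into a probability distribution over the ten classes, so argmax always yields some digit.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # shift for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
junk_logits = rng.normal(size=10)  # stand-in for the network's output on random input
probs = softmax(junk_logits)

# The probabilities still sum to 1 and argmax still "predicts" a digit,
# even though the input was meaningless.
print(probs.sum())
print(np.argmax(probs))
```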

Below I input random data with a label of ‘0’. The system thinks it’s a 2.