# TensorFlow (Beginner): Basic Image Classification

Welcome to this project on Basic Image Classification with Keras and TensorFlow. In this project, we will learn the basics of using Keras with TensorFlow as its backend and use it to solve a basic image classification problem with a Neural Network. By the end of the project, you will have created and trained a Neural Network model that can predict digits from hand-written images with a very high degree of accuracy. Along the way, you will have learned the fundamentals of Neural Networks, TensorFlow and Keras.

#### Rating: 5.0 / 5

We will cover the following tasks in 1 hour and 12 minutes:

### Introduction

TensorFlow is an open source machine learning library. It is one of the most popular and widely used machine learning libraries at the moment. However, working with TensorFlow may seem a bit challenging at first because you need to understand many of the underlying ideas about how computations in Neural Networks work. While it’s great to know those details, getting started with Neural Networks by writing computational graphs directly can be a bit intimidating.

This is where Keras comes in. Keras is a high-level API which can use TensorFlow as its backend and provides users with a simple-to-use interface. Keras does the heavy lifting behind the scenes, leaving developers to focus on just the high-level details. Not every developer will want to use Keras in a production setting, but for testing out ideas quickly, it’s an amazing tool.

To understand our problem better, we will first import the data that we will be working with and take a closer look at it. We are going to use the popular MNIST dataset, which contains images of hand-written digits along with their labels.

We have 60000 examples in the training set and 10000 examples in the test set. You will notice that each input x has the shape (28, 28). This means that each example is a grid of pixel values with 28 rows and 28 columns. Fortunately, we can simply display these examples using the Pyplot module from Matplotlib.

### One Hot Encoding

We will change the way each label is represented, from a class name or number to a list with one entry per possible class. Every entry is set to 0 except the one for the class the example belongs to, which is set to 1. For example, the label 5 becomes [0, 0, 0, 0, 0, 1, 0, 0, 0, 0].

So, now it’s as if our Neural Network will predict which switch is ON out of all the 10 switches instead of trying to predict an actual numeric value. This makes it a classification problem. If we were to try to predict an actual number, like 5 or 7, it would be a regression problem instead. In our case, we are trying to classify the image examples.
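The encoding described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the project's own code; the helper name `one_hot` is hypothetical. Keras also ships a `to_categorical` utility in `tensorflow.keras.utils` that does the same job.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return an array of shape (len(labels), num_classes) with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # For row i, switch ON the column whose index equals labels[i].
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example = one_hot([5, 0, 9])
print(example[0])  # the only 1 in this row sits at index 5
```

Each row now has exactly one "switch" turned on, so the original label can always be recovered as the index of that switch.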

### Neural Networks

In the example network, we have two hidden layers. The first layer, which holds all the X features, is called the input layer, and the layer that produces the output y is called the output layer. In this example, the output layer has only one “node”. A hidden layer can have many nodes or very few, depending on how complex the problem is. Here, both hidden layers have 2 nodes each. Each node is the output of a linear function which takes its inputs from the nodes of the preceding layer. All the Ws (weights) and all the bs (biases) associated with these linear functions have to be “learned” by our algorithm as it attempts to optimise those values to best fit the given data. In the hand-written digit classification problem, we will have 128 nodes in each of the two hidden layers, and of course we already know that the input is a 784-dimensional vector.
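To make the layer-by-layer computation concrete, here is a rough NumPy sketch of one forward pass through a network with two hidden layers of 2 nodes each and a single output node. Several details are assumptions made for illustration only: the number of input features (3), the random initial values for the Ws, the zero bs, and the ReLU activation applied after each hidden layer's linear function.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, W, b):
    """Linear function of the previous layer's outputs: W @ x + b."""
    return W @ x + b

def relu(z):
    """Assumed activation: keep positive values, replace negatives with 0."""
    return np.maximum(z, 0.0)

x = np.array([0.5, -1.0, 2.0])  # input layer: 3 features (hypothetical count)

# One W matrix and one b vector per layer; these are the values training would learn.
W1, b1 = rng.normal(size=(2, 3)), np.zeros(2)  # input -> hidden layer 1 (2 nodes)
W2, b2 = rng.normal(size=(2, 2)), np.zeros(2)  # hidden layer 1 -> hidden layer 2 (2 nodes)
W3, b3 = rng.normal(size=(1, 2)), np.zeros(1)  # hidden layer 2 -> output (1 node)

h1 = relu(dense(x, W1, b1))
h2 = relu(dense(h1, W2, b2))
y = dense(h2, W3, b3)

print(h1.shape, h2.shape, y.shape)
```

The shapes of the W matrices follow directly from the layer sizes: each has one row per node in its own layer and one column per node in the preceding layer.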

### Preprocessing the Examples

We will create a Neural Network which will take 784-dimensional vectors as inputs (28 rows * 28 columns) and will output a 10-dimensional vector (for the 10 classes). We have already converted the outputs to 10-dimensional, one-hot encoded vectors. Now, let’s convert the input to the required format as well. We will use NumPy to unroll the examples from (28, 28) arrays into 784-dimensional vectors.
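The unrolling step might look like the following sketch. A small stand-in array is used here in place of the real MNIST images, so the variable names and the batch size of 4 are illustrative only.

```python
import numpy as np

# Stand-in for a batch of images: 4 examples of 28 x 28 pixels each.
images = np.arange(4 * 28 * 28).reshape(4, 28, 28)

# Unroll every image row by row into a single 784-dimensional vector.
flat = np.reshape(images, (images.shape[0], 28 * 28))

print(images.shape, "->", flat.shape)
```

Reshaping does not change any pixel values. It only changes how they are indexed, so the pixel at row 1, column 0 of an image ends up at position 28 of its unrolled vector.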

Pixel values in this dataset range from 0 to 255. While that’s fine if we want to display our images, our neural network will learn the weights and biases for its layers more effectively and quickly if we normalise these values first. In one of the future projects, we will take a look at how this normalisation affects the speed of learning.
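The text does not say which normalisation is used, so the sketch below shows two common options as assumptions: rescaling pixels to the range 0 to 1, and standardising them with a mean and standard deviation. The function names and the small `epsilon` guard are illustrative choices.

```python
import numpy as np

def scale_to_unit(pixels):
    """Map raw 0..255 pixel values into the range 0..1."""
    return pixels.astype(np.float32) / 255.0

def standardise(pixels, mean, std, epsilon=1e-10):
    """Shift to zero mean and scale to unit spread; epsilon guards against division by zero."""
    return (pixels.astype(np.float32) - mean) / (std + epsilon)

raw = np.array([[0, 128, 255]], dtype=np.uint8)  # three sample pixel values

scaled = scale_to_unit(raw)
standardised = standardise(raw, raw.mean(), raw.std())

print(scaled)
print(standardised)
```

If the standardising variant is used, the same mean and standard deviation would normally be applied to both the training and the test set so that the two stay comparable.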

### Creating a Model

Creating a Neural Network model with the help of Keras is really simple. We use the Sequential class defined in Keras and add some layers to it. As discussed before, we will use two hidden layers with 128 nodes each and one output layer with 10 nodes for the 10 classes. All the layers are going to be Dense layers. This means that, as in our earlier example, every node of a layer is connected to every node of the preceding layer, i.e. the layers are densely connected.

We are instantiating a Sequential model. We pass on a list of layers that we want in our model, in the order that we want them. So, we have two hidden layers with 128 nodes each and one output layer with 10 nodes. We set the input shape on the first hidden layer to correspond to the shape of a single example from our reshaped training and test sets - we know each example is a 784 dimensional vector for the 784 pixels of the images.
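A model matching this description could be sketched as follows. The layer sizes and input shape come from the text, and the softmax output matches what the Predictions section mentions. The `relu` activation on the hidden layers is an assumption, since the text does not name one.

```python
import tensorflow as tf

model = tf.keras.models.Sequential([
    # First hidden layer; input_shape matches one unrolled 784-pixel example.
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    # Second hidden layer (relu is an assumed activation).
    tf.keras.layers.Dense(128, activation="relu"),
    # Output layer: one probability score per digit class.
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.summary()
```

The summary lists the number of learnable Ws and bs per layer. For example, the first hidden layer has 784 * 128 weights plus 128 biases. Newer Keras versions may print a warning recommending an explicit `Input` layer instead of `input_shape`, but the model is built the same way.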

### Training the Model

Let’s train the model now. We will use our training set, which has been normalised and reshaped. We are going to train the model for 5 epochs. Think of an epoch as one pass of all the training examples through the model. So, by setting the epochs to 5, we will go through all the training examples 5 times.

We get a training set accuracy of over 98%. While this is probably not as good as human-level performance, it still seems quite good for a machine. But, in order to ensure that this is not simple “memorization” by the machine, we should evaluate the performance on the test set. This is easy to do: we simply use the evaluate method on our model.
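The training and evaluation steps could be wired together roughly as below. The helper name `train_and_evaluate`, the `sgd` optimizer and the `categorical_crossentropy` loss are assumptions, as the text does not state them. To keep the sketch self-contained, it runs on a small batch of random stand-in data, so the accuracy it reports is meaningless; with the real preprocessed MNIST arrays, the same calls apply.

```python
import numpy as np
import tensorflow as tf

def train_and_evaluate(model, x_train, y_train, x_test, y_test, epochs=5):
    """Compile, fit for the given number of epochs, then score on held-out data."""
    model.compile(
        optimizer="sgd",                   # assumed optimizer
        loss="categorical_crossentropy",   # assumed loss for one-hot labels
        metrics=["accuracy"],
    )
    model.fit(x_train, y_train, epochs=epochs, verbose=0)
    loss, accuracy = model.evaluate(x_test, y_test, verbose=0)
    return loss, accuracy

# Random stand-in data with the same shapes as the preprocessed examples.
rng = np.random.default_rng(0)
x_demo = rng.random((64, 784)).astype("float32")
y_demo = np.eye(10, dtype="float32")[rng.integers(0, 10, size=64)]

demo_model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

loss, accuracy = train_and_evaluate(demo_model, x_demo, y_demo, x_demo, y_demo)
print("loss:", loss, "accuracy:", accuracy)
```

In a real run, the test set passed to `evaluate` must be data the model never saw during `fit`. That separation is what makes the test accuracy a check against memorization.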

### Predictions

Let’s get our model’s predictions on the test dataset. Each prediction is a list of probability scores, as we would expect from our softmax output. What we are interested in is the index of the highest probability score in each prediction. We can use NumPy’s argmax function to find it.
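Here is a small sketch of that step. The two rows of scores are made-up values standing in for real model output.

```python
import numpy as np

# Two made-up predictions, each with one probability score per digit 0..9.
preds = np.array([
    [0.05, 0.05, 0.70, 0.05, 0.05, 0.02, 0.02, 0.02, 0.02, 0.02],
    [0.02, 0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01],
])

# axis=1 picks the best-scoring class within each row (one row per example).
predicted_digits = np.argmax(preds, axis=1)

print(predicted_digits)  # → [2 1]
```

Because the classes are the digits 0 to 9 in order, the index returned by `argmax` is itself the predicted digit.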

We have a total of 10000 predictions. We probably can’t go through all the 10000 predictions for now, but we can take a look at the first few. Let’s plot the first few test set images along with their predicted and actual labels and see how our trained model actually performed.
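One possible way to draw such a grid with Matplotlib is sketched below. The helper name `plot_predictions`, the title format and the green/red colouring are illustrative choices, and random arrays stand in for the real test images and labels.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_predictions(images, predicted, actual, count=5):
    """Show the first `count` images, titled with predicted and actual labels."""
    fig, axes = plt.subplots(1, count, figsize=(2 * count, 2.5))
    for i, ax in enumerate(np.atleast_1d(axes)):
        ax.imshow(images[i], cmap="binary")
        # Green title when the prediction matches the label, red otherwise.
        colour = "green" if predicted[i] == actual[i] else "red"
        ax.set_title(f"pred {predicted[i]} / true {actual[i]}", color=colour)
        ax.set_xticks([])
        ax.set_yticks([])
    return fig

# Stand-in data: 5 random 28 x 28 "images" with made-up labels.
rng = np.random.default_rng(0)
demo_images = rng.random((5, 28, 28))
demo_fig = plot_predictions(demo_images, [7, 2, 1, 0, 4], [7, 2, 1, 0, 9])
```

Calling `plt.show()` afterwards displays the figure in an interactive session. With real data, the images need to be in their original (28, 28) shape rather than the unrolled 784-dimensional form.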

## Reviews

it is a superb experience eren

Nice job Pi-Yueh Chuang

Love the courses. Bring more! Walter Keith Tomes

This was an amazing learning experience! I wasn't able to complete the project the first time, but I tried it again and was able to complete it successfully. The second time through, I sped up the video during the explanations and then paused it whenever I needed to do the coding portions. It was amazing how much I retained from the first time. When I did the project the second time, I understood everything that was happening and feel like I gained a lot of knowledge. This is so much better than just watching a video. I'm thrilled to have been given this great learning opportunity and want to do more! John Luton

You need to figure out a way to provide a readable view of the teacher's notebook that is wide enough that they don't have to scroll the screen back and forth so much. It is very distracting/nauseating. Also, why can I not save the final state of the notebook out of the cloud desktop onto my local for later review? I don't have a perfect memory and don't want to have to redo the task to get to one line of code or a cell that I need for something else. Finally, when I see an issue, being the nitpicker I am, I want to note/report it then and there; but your system has no feedback button on the learning desktop/page. This might also be useful to collect the "state"/context of a piece of feedback rather than trying to get the user to recall all that info later when they use the "contact us" feature outside the learning environ. great stuff. thanks Brian

I am a Software Engineer with many years of experience in writing commercial software. My current areas of interest include computer vision and sequence modelling for automated signal processing using deep learning as well as developing chatbots.

##### How is this different from YouTube, PluralSight, Udemy, etc.?
In Rhyme, all projects are completely hands-on. You don't just passively watch someone else. You use the software directly while following the host's (Amit Yadav) instructions. Using the software is the only way to achieve mastery. With the "Live Guide" option, you can ask for help and get an immediate response.