Computer Vision with TensorFlow: Transfer Learning

In this course, we will take a ResNet50 model pre-trained on the ImageNet dataset, where it learned to classify images into one of a thousand classes, and apply it to a new problem: distinguishing between two classes from a new dataset. The knowledge the ResNet50 model gained during its ImageNet training transfers to this different, though related, problem. This way, you can create new models for specific problems without having to create and train a complete deep learning model from scratch. Instead, you will train just the last layer or layers, and all of the heavy lifting will be done by the pre-trained model.

Available On Coursera

Task List

We will cover the following tasks in 51 minutes:


First, we will get to know the Rhyme interface and our learning environment. You will get a virtual machine with Jupyter Notebook and TensorFlow already installed; both are required for this course. Jupyter Notebooks are very popular with data scientists and machine learning engineers because you can write code in some cells and documentation in others. In this project, we will apply transfer learning to a binary image classification problem: we will create a model that can distinguish between Glasses and Tables. The dataset is available on GitHub and was published by Turkish engineer Muhammed Buyukkinaci.

Importing Libraries

Start by importing the ResNet50 model along with its preprocess_input function, which we will need later. We will also need the Sequential class and the Dense layer from Keras; we will use Dense to create a new output layer. And, as always, we will import NumPy, the fundamental package for scientific computing in Python.
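The imports above can be sketched as follows, assuming TensorFlow 2.x with its bundled Keras API (in standalone Keras, the same names live under `keras.*` instead):

```python
# Libraries used throughout this project.
import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
```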

Create and Compile a New Model

The first layer is going to be our ResNet50 model, with its weights set to the pre-trained weights from ImageNet training. We will set include_top to False, which means the final dense layer with 1000 outputs is not used. Instead, we will apply global average pooling to the output of the last convolutional layer in the ResNet model, which makes the output of the ResNet layer a 2D tensor. Then we will add the output layer: a dense layer with just 2 nodes. Technically, we don't really need 2 nodes for binary classification; just 1 node would do. But for this example we will use 2 nodes, so you can apply the same approach to multi-class classification problems as well.

We will use categorical cross-entropy as our loss function and the common stochastic gradient descent algorithm as the optimizer. We will track only one training metric: accuracy.
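A minimal sketch of the model described above, wrapped in a helper function so it can also be built offline. The `create_model` name and the `weights` parameter are conveniences for illustration; `weights='imagenet'` downloads the pre-trained weights, while `weights=None` builds the same architecture without downloading:

```python
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def create_model(weights='imagenet'):
    # Headless ResNet50: no 1000-way top layer, global average pooling
    # turns the last convolutional output into a (batch, 2048) tensor.
    base = ResNet50(include_top=False, pooling='avg', weights=weights)
    base.trainable = False  # freeze the pre-trained feature extractor
    model = Sequential([
        base,
        Dense(2, activation='softmax'),  # our new 2-node output layer
    ])
    model.compile(optimizer='sgd',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model
```

The new head has 2048 × 2 weights plus 2 biases, i.e. the 4,098 trainable parameters mentioned in the training section below.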

Image Data Generator

We will use the ImageDataGenerator class from the image preprocessing library of Keras. We will create two data generators with the help of this class - one for our training set and another one for our validation set. We will set the preprocessing function of the ImageDataGenerator class to the ResNet50 preprocess_input function that we imported earlier.

We will create a data generator for the training dataset and another one for the validation dataset. We can use flow_from_directory method and specify the data directory, the target size, and the class mode for each data generator.
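A sketch of the two generators, under the assumption that the Glasses/Tables images live in one sub-folder per class (the directory paths and the `make_generators` helper name are illustrative, not from the course):

```python
from tensorflow.keras.applications.resnet50 import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def make_generators(train_dir, val_dir, target_size=(224, 224)):
    # Every image is run through ResNet50's own preprocessing.
    datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
    train_gen = datagen.flow_from_directory(
        train_dir, target_size=target_size, class_mode='categorical')
    val_gen = datagen.flow_from_directory(
        val_dir, target_size=target_size, class_mode='categorical')
    return train_gen, val_gen
```

Usage would look like `train_gen, val_gen = make_generators('data/train', 'data/validation')`, where class labels are inferred from the sub-folder names.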

Train the Model

The first layer, which is the ResNet50 layer, was already trained on the ImageNet dataset, and we are not training that layer. We are only asking the model to learn the 4,098 parameters it needs for all the connections between our output layer and the last layer of the ResNet50 model.

Since we are using the image data generator, we need to use a variant of the fit method called fit_generator.

Specify the training generator for the training data, the validation generator for the validation data, the validation steps, and the steps per epoch. And, of course, specify epochs - where one epoch is an iteration over the entire data provided.
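A sketch of the training call. Note that fit_generator was removed in newer TensorFlow 2.x releases, where model.fit accepts generators directly with the same keyword arguments, so this hedged version uses fit (the `train_model` wrapper is illustrative):

```python
def train_model(model, train_gen, val_gen,
                steps_per_epoch, validation_steps, epochs=3):
    # In older Keras this call was model.fit_generator(...); in current
    # TF 2.x, model.fit accepts generators with identical arguments.
    return model.fit(train_gen,
                     steps_per_epoch=steps_per_epoch,
                     epochs=epochs,
                     validation_data=val_gen,
                     validation_steps=validation_steps)
```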

Prepare Test Images

Let's define a function called prepare_images. It will take paths to test images as input and first create an array of images. We will use two helper functions from Keras' image preprocessing module: one to load images and another to convert those images to NumPy arrays. The resulting NumPy array can then be fed into the preprocess_input function that we imported from Keras' ResNet module earlier, which puts it into a suitable format to be fed into our model.
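The steps above can be sketched as (the exact signature is an assumption; the helpers load_img, img_to_array, and preprocess_input are the ones described in the text):

```python
import numpy as np
from tensorflow.keras.applications.resnet50 import preprocess_input
from tensorflow.keras.preprocessing.image import load_img, img_to_array

def prepare_images(paths, target_size=(224, 224)):
    # Load each image at the size ResNet50 expects, convert it to a
    # NumPy array, stack into one batch, and apply ResNet50 preprocessing.
    arrays = [img_to_array(load_img(p, target_size=target_size))
              for p in paths]
    return preprocess_input(np.stack(arrays))
```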

Make Predictions on Test Images

We will create a function that shows every test image passed into it along with the model's prediction for that image. Since we have only two classes, the prediction is simply whether the given image is a glass or a table.

We will use the prepare_images function, which returns the test data in an appropriate format. Then we make the predictions by calling the predict method on our model. Finally, we print out the prediction - but remember, we need to get the index of the highest confidence score and print out the corresponding class name.
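A sketch of the prediction step, operating on a batch already prepared by prepare_images. The class order here is an assumption for illustration; in practice it should be read from the training generator's class_indices:

```python
import numpy as np

# Assumed label order; replace with train_generator.class_indices in practice.
CLASS_NAMES = ['glasses', 'tables']

def show_predictions(model, batch, class_names=CLASS_NAMES):
    # `batch` is the preprocessed NumPy array returned by prepare_images.
    preds = model.predict(batch)
    for i, scores in enumerate(preds):
        # The index of the highest confidence score picks the class name.
        print('image %d -> %s' % (i, class_names[int(np.argmax(scores))]))
    return preds
```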

Watch Preview

Preview the instructions that you will follow along in a hands-on session in your browser.

Amit Yadav

About the Host (Amit Yadav)

I am a machine learning engineer with a focus on computer vision and sequence modelling for automated signal processing using deep learning techniques. My previous experience includes leading chatbot development for a large corporation.

Frequently Asked Questions

In Rhyme, all projects are completely hands-on. You don't just passively watch someone else. You use the software directly while following the host's (Amit Yadav) instructions. Using the software is the only way to achieve mastery. With the "Live Guide" option, you can ask for help and get immediate response.
Nothing! Just join through your web browser. Your host (Amit Yadav) has already installed all required software and configured all data.
You can go to, sign up for free, and follow this visual guide How to use Rhyme to create your own projects. If you have custom needs or company-specific environment, please email us at
Absolutely. We offer Rhyme for workgroups as well as larger departments and companies. Universities, academies, and bootcamps can also buy Rhyme for their settings. You can select projects and trainings that are mission critical for you and, as well, author your own that reflect your own needs and tech environments. Please email us at
Rhyme strives to ensure that its visual instructions are helpful for those with reading impairments. The Rhyme interface has features like resolution and zoom that will be helpful for visual impairments. And, we are currently developing closed-caption functionality to help with hearing impairments. Most of the accessibility options of the cloud desktop's operating system or the specific application can also be used in Rhyme. If you have questions related to accessibility, please email us at
We started with Windows and Linux cloud desktops because they have the most flexibility in teaching any software (desktop or web). However, web applications like Salesforce can run directly through a virtual browser. And others, like Jupyter and RStudio, can run on containers and be accessed by virtual browsers. We are currently working on features where such web applications won't need to run through cloud desktops. But the rest of the Rhyme learning, authoring, and monitoring interfaces will remain the same.
Please email us at and we'll respond to you within one business day.
