Google Cloud AI: End to End Deep Learning Part 2

Welcome to Google Cloud AI: End to End Deep Learning Part 2! In this series, we are working towards creating a face verification system using deep learning. In this second project of the three-project series, we will use our dataset to train a model to learn face embeddings, using something called triplet loss as the model's loss function.

Available Through Coursera

Task List


We will cover the following tasks in 1 hour and 7 minutes:


Introduction and Downloading VGG19 Model

Let’s launch the notebook instance that we created in the previous project and create a new notebook on it for our training process. Then we will download a VGG19 model from Keras along with its weights pre-trained on the ImageNet dataset. We will not use the VGG19 model’s top layer; instead, we will add our own top layer to get 128-dimensional vectors for our face embeddings.
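A minimal sketch of this download step is shown below. The input shape is an assumption; use whatever size your pre-processed face crops have.

```python
# Download VGG19 with ImageNet weights and without its classification head.
from tensorflow.keras.applications import VGG19

vgg = VGG19(
    weights='imagenet',          # pre-trained ImageNet weights
    include_top=False,           # drop the top layer; we add our own later
    input_shape=(128, 128, 3)    # assumed face-crop size
)
vgg.summary()
```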


Creating Embedding Model

We have already downloaded the VGG19 weights and architecture that our embedding model will be based on. We will use the Sequential class from Keras to construct the embedding model. Its first layer is the previously downloaded VGG19 model, which, in our case, does not include a top layer. Then we will Flatten the VGG19 output and add a Dense layer with 128 nodes, a sigmoid activation function, and a kernel_initializer. At this point, we don’t need to compile the embedding model: we will use it as part of another computational graph later and set the compile settings on that graph.
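A sketch of this embedding model, assuming the `vgg` base from the previous step, is shown below. The specific initializer (`he_uniform`) is an illustrative assumption.

```python
# Build the embedding model: VGG19 base -> Flatten -> 128-dim Dense head.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

embedding_model = Sequential([
    vgg,                                    # pre-trained VGG19 base (no top)
    Flatten(),                              # flatten the convolutional features
    Dense(128, activation='sigmoid',        # 128-dimensional face embedding
          kernel_initializer='he_uniform')  # assumed initializer choice
])
# No compile() here: this model is used inside the Face Verifier graph later.
```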


Triplet Loss

Let’s first import the indices of the triplets that were saved in the previous project. In triplet loss, we look at three embeddings at a time: an anchor, a positive, and a negative. We then calculate the Euclidean distance between the anchor and the positive, and between the anchor and the negative. The anchor-positive distance should be much lower than the anchor-negative distance. We will also need to do online mining for the hardest negative examples for the model: hardest in the sense that they are the hardest for the model to distinguish from the positive examples given an anchor. Based on this, let’s implement our custom loss function!
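A minimal sketch of a margin-based triplet loss is shown below (the hard-negative mining described above is not included here). It assumes the network outputs the anchor, positive, and negative embeddings concatenated along the last axis, and the margin value is an assumption.

```python
# Triplet loss on a concatenated (anchor | positive | negative) output.
import tensorflow.keras.backend as K

EMB_SIZE = 128   # embedding dimensionality
ALPHA = 0.2      # margin between positive and negative distances (assumed value)

def triplet_loss(y_true, y_pred):
    # split the concatenated output back into the three embeddings
    anchor   = y_pred[:, :EMB_SIZE]
    positive = y_pred[:, EMB_SIZE:2 * EMB_SIZE]
    negative = y_pred[:, 2 * EMB_SIZE:]

    pos_dist = K.sum(K.square(anchor - positive), axis=-1)  # anchor-positive distance
    neg_dist = K.sum(K.square(anchor - negative), axis=-1)  # anchor-negative distance

    # push positives closer than negatives by at least the margin
    return K.maximum(pos_dist - neg_dist + ALPHA, 0.0)
```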


Triplet Generator

In this task, we will see how to create a custom data generator that can be used during model training. When a dataset is too large to load into memory in its entirety, a generator like this lets us read only a small batch of data at a time, train the model on that batch, then read the next batch, and so on.

We will randomly select triplet examples when creating each new batch of training data. The dataset was already pre-processed before we saved it to disk in the last project, which saves us some time during training. We don’t need to worry about labels, since the triplet loss function does not require them. A sketch of such a generator follows.
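The sketch below assumes the pre-processed face images are available as an array `x` and that the saved triplet indices form rows of (anchor, positive, negative); those names and shapes are assumptions, not the course's exact variables.

```python
# Generator that yields random batches of (anchor, positive, negative) images.
import numpy as np

def triplet_generator(x, triplet_indices, batch_size=100):
    while True:
        # randomly pick `batch_size` triplets for this batch
        rows = triplet_indices[np.random.randint(0, len(triplet_indices), batch_size)]
        anchors   = x[rows[:, 0]]
        positives = x[rows[:, 1]]
        negatives = x[rows[:, 2]]
        # labels are unused by the triplet loss, so dummy zeros are enough
        dummy_y = np.zeros((batch_size, 1))
        yield [anchors, positives, negatives], dummy_y
```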


Face Verifier Network

In this task, we will create a new model using Keras’ Model class, which we are calling the Face Verifier Network. This is essentially a Siamese network: it takes three inputs for the three image arrays from the triplet generator and gives a concatenated output of the three embedding vectors. Those three embedding vectors are then used to calculate the triplet loss, since we set the previously defined triplet loss as this Face Verifier’s loss function.

Note that the embedding model used to calculate the embedding vectors is shared across the three branches of this network; that is, the weights are shared. We will also compile the Face Verifier with the Adam optimizer with lr=1e-5.
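A sketch of this Face Verifier, reusing the earlier `embedding_model` and `triplet_loss`, is shown below. The input shape is an assumption matching the earlier sketches, and `learning_rate=1e-5` is the same value the course refers to as lr=1e-5.

```python
# Siamese Face Verifier: three inputs, one shared embedding model.
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Concatenate
from tensorflow.keras.optimizers import Adam

in_anchor   = Input(shape=(128, 128, 3), name='anchor')
in_positive = Input(shape=(128, 128, 3), name='positive')
in_negative = Input(shape=(128, 128, 3), name='negative')

# the same embedding model is applied to all three inputs, so weights are shared
emb_anchor   = embedding_model(in_anchor)
emb_positive = embedding_model(in_positive)
emb_negative = embedding_model(in_negative)

# concatenate so triplet_loss can split the embeddings back apart
merged = Concatenate(axis=-1)([emb_anchor, emb_positive, emb_negative])

face_verifier = Model(inputs=[in_anchor, in_positive, in_negative], outputs=merged)
face_verifier.compile(loss=triplet_loss, optimizer=Adam(learning_rate=1e-5))
```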


Training with Triplet Loss

During training, we will use an EarlyStopping callback so that we don’t do any unnecessary training: the process stops when the network no longer seems to be making significant improvement. Because of the negative mining concept, the batch size should be higher than what we would normally use; the FaceNet paper uses almost 1,800 examples per batch. We are creating a much smaller version of that implementation and also have less memory available, so we will use only 16 steps per epoch with a batch size of about 100 examples.
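A sketch of the training call, assuming the generator and verifier from the earlier sketches, is shown below. The patience value, epoch count, and the `x_train`/`triplet_indices` names are assumptions.

```python
# Train the Face Verifier with early stopping on the training loss.
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='loss', patience=3)  # stop when loss plateaus

history = face_verifier.fit(
    triplet_generator(x_train, triplet_indices, batch_size=100),
    steps_per_epoch=16,      # small number of steps per epoch, as described above
    epochs=50,               # upper bound; EarlyStopping usually ends training sooner
    callbacks=[early_stop]
)
```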


Training History

You should see a general downward trend in your loss values. We will plot the loss values on a graph to look at the trend, and then save our model as a Keras .h5 file.
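A sketch of plotting the loss trend and saving the model follows; saving the embedding model (rather than the full verifier) is an assumption here.

```python
# Plot the training loss and save the trained model in HDF5 format.
import matplotlib.pyplot as plt

plt.plot(history.history['loss'])
plt.xlabel('Epoch')
plt.ylabel('Triplet loss')
plt.show()

embedding_model.save('embedding_model.h5')  # Keras .h5 model file
```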


Euclidean Distances

Is there a meaningful difference between the Euclidean distances for anchor-positive pairs and for anchor-negative pairs? In this task, we will randomly select a few examples and get their embedding vectors as predicted by our embedding model.

We will then define a simple function to calculate the Euclidean distance between two vectors and use it to find the distances between the anchor and positive examples, and between the anchor and negative examples. When these two sets of distances are plotted on a graph, we will see that the anchor-positive distances are, in general, much lower than the anchor-negative distances.
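The sketch below shows this comparison using the trained embedding model; the `sample_anchors`, `sample_positives`, and `sample_negatives` arrays are hypothetical names for the randomly selected examples.

```python
# Compare anchor-positive and anchor-negative Euclidean distances.
import numpy as np

def euclidean_distance(a, b):
    # straight-line distance between two embedding vectors
    return np.sqrt(np.sum(np.square(a - b)))

# embeddings for a few randomly selected triplets (shape: (n, 128))
emb_a = embedding_model.predict(sample_anchors)
emb_p = embedding_model.predict(sample_positives)
emb_n = embedding_model.predict(sample_negatives)

pos_dists = [euclidean_distance(a, p) for a, p in zip(emb_a, emb_p)]
neg_dists = [euclidean_distance(a, n) for a, n in zip(emb_a, emb_n)]

# anchor-positive distances should generally be lower than anchor-negative ones
print(np.mean(pos_dists), np.mean(neg_dists))
```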

Watch Preview

Preview the instructions that you will follow along in a hands-on session in your browser.

About the Host (Amit Yadav)


I am a machine learning engineer with a focus on computer vision and sequence modelling for automated signal processing using deep learning techniques. My previous experience includes leading chatbot development for a large corporation.



Frequently Asked Questions


In Rhyme, all projects are completely hands-on. You don't just passively watch someone else. You use the software directly while following the host's (Amit Yadav) instructions. Using the software is the only way to achieve mastery. With the "Live Guide" option, you can ask for help and get an immediate response.
Nothing! Just join through your web browser. Your host (Amit Yadav) has already installed all required software and configured all data.
You can go to https://rhyme.com, sign up for free, and follow this visual guide How to use Rhyme to create your own projects. If you have custom needs or company-specific environment, please email us at help@rhyme.com
Absolutely. We offer Rhyme for workgroups as well as for larger departments and companies. Universities, academies, and bootcamps can also buy Rhyme for their settings. You can select projects and trainings that are mission critical for you and also author your own that reflect your needs and tech environments. Please email us at help@rhyme.com
Rhyme strives to ensure that visual instructions are helpful for reading impairments. The Rhyme interface has features like resolution and zoom that will be helpful for visual impairments. And, we are currently developing closed-captioning functionality to help with hearing impairments. Most of the accessibility options of the cloud desktop's operating system or the specific application can also be used in Rhyme. If you have questions related to accessibility, please email us at accessibility@rhyme.com
We started with Windows and Linux cloud desktops because they have the most flexibility in teaching any software (desktop or web). However, web applications like Salesforce can run directly through a virtual browser. And others, like Jupyter and RStudio, can run on containers and be accessed through virtual browsers. We are currently working on features where such web applications won't need to run through cloud desktops. But the rest of the Rhyme learning, authoring, and monitoring interfaces will remain the same.
Please email us at help@rhyme.com and we'll respond to you within one business day.