We will cover the following tasks in 60 minutes:
Introduction and Setup
Welcome to this end-to-end deep learning project. We will create a face verification system using deep learning, and in this first project in a three-project series on end-to-end deep learning, we will create the dataset that we will later use to train our model. Before we get started, we will also install a couple of packages that we will use in this project.
Downloading the LFW Data
Labeled Faces in the Wild, or LFW, is a popular dataset containing thousands of face images of thousands of celebrities. We are going to use the deep-funneled version of this dataset for our project. The deep-funneled version is pre-processed to some degree, with faces aligned and positioned correctly, making our job a little easier. We will also need the names of all the classes in this dataset. The images in the dataset are already organized into their corresponding class folders inside a tar file. In this task, we will extract this tar file once it's downloaded. We will also take a look at the classes in the dataset.
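The download-and-extract step can be sketched as below. The URLs shown are assumptions based on the usual LFW hosting location and should be verified against the current LFW download page before use.

```python
import os
import tarfile
import urllib.request

# Assumed locations of the deep-funneled archive and the class-name list;
# verify these against the current LFW download page.
LFW_TAR_URL = "http://vis-www.cs.umass.edu/lfw/lfw-deepfunneled.tgz"
LFW_NAMES_URL = "http://vis-www.cs.umass.edu/lfw/lfw-names.txt"


def download_if_missing(url, dest):
    """Download url to dest unless the file is already present."""
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)
    return dest


def extract_tar(tar_path, out_dir="."):
    """Extract a (possibly gzipped) tar archive and return its member names."""
    with tarfile.open(tar_path) as tar:
        tar.extractall(out_dir)
        return [m.name for m in tar.getmembers()]
```

After extraction, each class (person) has its own folder of JPEG images inside the output directory.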
A Look at the LFW Data
Let’s import our text file into the notebook and create a list of all the classes which have more than one image in our dataset. This is because we want to use a loss function called triplet loss, which needs at least two image examples of every class. Once we have selected the classes that have more than one example in the dataset, we will also take a look at some of the images, both to familiarize ourselves with the image data and to see the effect of deep-funneling.
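The class-filtering step might look like the sketch below, assuming the names file follows the usual lfw-names.txt layout of one "name&lt;tab&gt;image_count" pair per line:

```python
def classes_with_multiple_images(names_path):
    """Return class names that have more than one image example.

    Assumes each line of the file is "<class_name>\t<image_count>".
    """
    keep = []
    with open(names_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            name, count = parts[0], int(parts[-1])
            if count > 1:  # triplet loss needs at least two examples per class
                keep.append(name)
    return keep
```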
Setup GPU and CUDA Drivers
Before we can continue, we will need to add a GPU to our Notebook instance and also install the CUDA drivers. This is because we are using a pre-existing package to detect faces in our images which, in turn, uses the dlib package, which runs on the GPU. Working with GCP actually makes it much simpler to set up the GPU and CUDA drivers, which are notoriously difficult to install and configure. So, we are in luck in that sense! Once everything is set up, we will move on to the next task.
TensorFlow GPU and Helper Functions
Once the GPU is set up, we will need to install the tensorflow-gpu package, which is the GPU version of TensorFlow. We need this package for TensorFlow to take advantage of the GPU that we just added to our Notebook instance. Next, we will import a number of packages into our Notebook that we will need during this project, along with some helper functions from the Keras image pre-processing library. Let’s also create a few directories to store the dataset of NumPy arrays that we plan on creating.
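The directory setup is straightforward; a minimal sketch is shown below. The directory names here are hypothetical — the project may use a different layout.

```python
import os

# Hypothetical layout for the NumPy triplet dataset; the actual
# directory names used in the project may differ.
DATA_DIRS = ["data/anchors", "data/positives", "data/negatives"]


def make_dataset_dirs(base=".", dirs=DATA_DIRS):
    """Create the dataset directories, returning their full paths."""
    paths = [os.path.join(base, d) for d in dirs]
    for p in paths:
        os.makedirs(p, exist_ok=True)  # no error if a directory already exists
    return paths
```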
Create Dataset Part 1
We will define a function called get_image which will extract just the face out of an image example, given the class name and an image number for that class. This function will return the extracted face from any given image. We will also resize the cropped face image. Additionally, we will create a function to plot a single example of an image triplet. We will later use this function to print out some of the examples as we convert our triplet image examples to triplets of NumPy arrays.
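The crop-and-resize step inside a get_image-style helper can be sketched with NumPy alone. In the project, the face bounding box would come from a detector (for example, face_recognition.face_locations, which returns (top, right, bottom, left) boxes); here the box is simply an input, and the resize is a dependency-free nearest-neighbour version rather than the Keras/PIL resize the project would actually use.

```python
import numpy as np


def crop_face(image, box, size=96):
    """Crop a face region from an image array and resize it to (size, size).

    image: H x W x C uint8/float array.
    box:   (top, right, bottom, left), as returned by dlib-style detectors.
    """
    top, right, bottom, left = box
    face = image[top:bottom, left:right]
    # Nearest-neighbour resize via index sampling (no extra dependencies).
    rows = (np.arange(size) * face.shape[0] / size).astype(int)
    cols = (np.arange(size) * face.shape[1] / size).astype(int)
    return face[rows][:, cols]
```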
Create Dataset Part 2
Let’s define the size for our cropped-out faces. In this task, we finally create our dataset of NumPy arrays. Inside a for loop, we will select an anchor, a positive example of the same class as the anchor, and a negative example from a different class. The loop runs for the total number of classes which have more than one image example in the dataset, ultimately creating the same number of triplets. Of course, we could create more triplets because many classes have more than two examples, but let’s keep things relatively simple in this project. We are creating a dataset of NumPy arrays after not only cropping the face images, but also pre-processing them with the help of Keras’ image pre-processing helper function. This will be a huge time saver when we train the model later.
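The triplet-building loop described above can be sketched as follows, assuming images_by_class is a hypothetical mapping from class name to a list of already-preprocessed face arrays, where every class has at least two examples (as selected earlier):

```python
import random


def build_triplets(images_by_class, seed=0):
    """Build one (anchor, positive, negative) triplet per class."""
    rng = random.Random(seed)
    names = list(images_by_class)
    triplets = []
    for name in names:
        # Anchor and positive: two distinct examples of the same class.
        anchor, positive = rng.sample(images_by_class[name], 2)
        # Negative: a random example from any other class.
        other = rng.choice([n for n in names if n != name])
        negative = rng.choice(images_by_class[other])
        triplets.append((anchor, positive, negative))
    return triplets
```

One triplet per class keeps the dataset size equal to the number of selected classes; sampling more pairs per class would be a simple extension.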
Let’s take a look at the newly created dataset of NumPy arrays. Face detection may occasionally have failed in the previous step, though this happens rarely, so let’s check how many triplet examples were actually created. We will also load one of the saved NumPy files to make sure our dataset creation worked correctly.
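The sanity check might look like the sketch below; the triplet file naming here is hypothetical.

```python
import os
import numpy as np


def save_triplet(out_dir, index, anchor, positive, negative):
    """Save one triplet as a single stacked array (hypothetical naming)."""
    path = os.path.join(out_dir, f"triplet_{index}.npy")
    np.save(path, np.stack([anchor, positive, negative]))
    return path


def verify_triplets(out_dir):
    """Count the saved triplet files and reload one to check its shape."""
    files = sorted(f for f in os.listdir(out_dir) if f.endswith(".npy"))
    sample = np.load(os.path.join(out_dir, files[0]))
    return len(files), sample.shape
```

If face detection failed for some classes, the file count will simply be lower than the number of selected classes.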
About the Host (Amit Yadav)
I am a machine learning engineer with a focus on computer vision and sequence modelling for automated signal processing using deep learning techniques. My previous experience includes leading chatbot development for a large corporation.