We will cover the following tasks in 1 hour and 4 minutes:
We will get familiar with the Rhyme interface and our learning environment. You will be given a virtual machine with Jupyter Notebook and TensorFlow already installed, both of which this course requires. Jupyter Notebooks are very popular with data scientists and machine learning engineers because code can be written in some cells while other cells hold documentation. Computer vision is a complicated subject, but its primary aim is to help machines gain a high-level understanding of images and videos. As the name implies, computer vision is about helping computers achieve a human-like visual system.
Importing the Libraries
We will be using a pre-trained neural network model called ResNet50 in this course. ResNet50 is a neural network with 50 layers, which is what is normally known as a very deep neural network. This ResNet50 model is already trained on a dataset called ImageNet, a very large dataset with millions of images, each labeled with the object it contains. The full dataset spans thousands of classes, or categories; the pre-trained Keras model predicts 1,000 of them.
Not only can we download this trained ResNet50 model, but Keras also comes with an existing method to preprocess input data so that it can be fed to the trained model. We will import this method along with two more methods from Keras’ image preprocessing library. These two methods will help us load images and convert images to NumPy arrays. And of course, we will also need NumPy, which is the fundamental package for scientific computing in Python.
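The imports described above might look like the following sketch. It assumes the Keras API bundled with TensorFlow 2.x; the exact import paths used in the course notebook are not shown in this outline, and the `LIBRARIES_AVAILABLE` flag is a hypothetical name added here only to report a missing install cleanly.

```python
# Assumption: Keras is used through the tensorflow.keras namespace.
try:
    import numpy as np
    # Preprocessing function that matches the trained ResNet50 weights.
    from tensorflow.keras.applications.resnet50 import preprocess_input
    # Helpers to load an image file and convert it to a NumPy array.
    from tensorflow.keras.preprocessing.image import load_img, img_to_array
    LIBRARIES_AVAILABLE = True  # hypothetical flag, not part of the course
except ImportError as error:
    LIBRARIES_AVAILABLE = False
    print(f"Missing library: {error.name}")
```

On the course virtual machine both libraries are preinstalled, so the `except` branch should not be reached there.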
Importing the ResNet50 Model
Importing the ResNet50 model is also really straightforward, and Keras provides a simple way of doing that. Once we have imported ResNet50, we will need to create an instance of this model. Weights are internal parameter values that the neural network learned when it was trained on the ImageNet dataset. Setting include_top to True means that the fully connected classification layer at the top (the output end) of the network will be included.
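As a minimal sketch of this step, assuming the tensorflow.keras namespace: `MODEL_KWARGS` and `build_model` are hypothetical names chosen for illustration, while `ResNet50`, `weights` and `include_top` are the actual Keras names.

```python
# Keyword arguments described in the text: ImageNet weights, top layer kept.
MODEL_KWARGS = {"weights": "imagenet", "include_top": True}


def build_model(**overrides):
    """Create a ResNet50 instance (hypothetical helper).

    The ImageNet weights are downloaded the first time this is called.
    """
    # Imported lazily so that defining the function does not load TensorFlow.
    from tensorflow.keras.applications.resnet50 import ResNet50

    return ResNet50(**{**MODEL_KWARGS, **overrides})


# Usage (needs TensorFlow, plus an internet connection on the first run):
# model = build_model()
# model.summary()
```

With `include_top=True` the model expects 224x224 RGB images and returns one probability per ImageNet class, 1,000 values per image.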
Preparing the Images
We will need to preprocess our images so that the ResNet50 model can make predictions on them. We already imported the methods needed for this type of preprocessing in a previous chapter. We will define a function called prepare_images that takes the image paths as a parameter. Essentially, we build a list of images by iterating through all the image paths and loading each one. Then, we convert that list into a NumPy array, because the preprocess_input function from ResNet50 expects the images to be laid out in a NumPy array. Finally, we simply return the output of the preprocess_input method. This is the output that we will feed into our model.
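The function described above could be sketched as follows. The `target_size` default of 224x224 and the empty-input check are assumptions added here (224x224 is the input size ResNet50 expects when its top layer is included); the course notebook may differ in detail.

```python
def prepare_images(image_paths, target_size=(224, 224)):
    """Load images from disk and return a preprocessed NumPy batch."""
    if not image_paths:
        # Assumption: fail early rather than pass an empty batch along.
        raise ValueError("image_paths must contain at least one path")

    # Imported lazily so that defining the function does not load TensorFlow.
    import numpy as np
    from tensorflow.keras.applications.resnet50 import preprocess_input
    from tensorflow.keras.preprocessing.image import load_img, img_to_array

    images = []
    for path in image_paths:
        image = load_img(path, target_size=target_size)  # resized PIL image
        images.append(img_to_array(image))               # (height, width, 3)

    batch = np.array(images)  # (number of images, height, width, 3)
    return preprocess_input(batch)


# Usage with hypothetical file names:
# batch = prepare_images(["images/cat.jpg", "images/dog.jpg"])
```

Resizing every image to the same size is what allows the list to be stacked into a single four-dimensional array.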
We have already defined the prepare_images method, so we just need to use it now. We will use the predict method available in our model, which returns an array of predictions for the data that we feed the model. However, the predictions are encoded: each one holds a probability value for every one of the model's 1,000 classes, which makes them a bit confusing to use directly. But as you’d expect by now, Keras gives you a method to decode these predictions. From the ImageNet utilities, let’s import the decode_predictions method and apply it to the predictions that we just got. We are interested in only the topmost prediction for each image. That is, we want to know what our model thinks is the most likely object in a given image.
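The prediction and decoding step might be sketched like this. `decode_predictions` with `top=1` returns, for each image, a list holding one `(class_id, class_name, score)` tuple; the helper names `top_labels` and `predict_top_labels` are hypothetical and exist only for illustration.

```python
def top_labels(decoded):
    """Reduce decoded predictions to one (class_name, score) pair per image."""
    labels = []
    for entry in decoded:
        _class_id, name, score = entry[0]  # only the topmost prediction
        labels.append((name, float(score)))
    return labels


def predict_top_labels(model, prepared_images):
    """Run the model and decode the most likely class for each image."""
    # Imported lazily so that defining the function does not load TensorFlow.
    from tensorflow.keras.applications.imagenet_utils import decode_predictions

    predictions = model.predict(prepared_images)  # one row of scores per image
    return top_labels(decode_predictions(predictions, top=1))


# Made-up decoded output for two images (class IDs here are placeholders):
sample = [[("id-1", "cat", 0.9)], [("id-2", "dog", 0.75)]]
print(top_labels(sample))
```

Here `model` would be the ResNet50 instance and `prepared_images` the output of the preprocessing step.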
Object Detection and YOLO
Object detection is a very interesting task. With our neural network models, not only can we detect which objects are present in a given image, but we can also ask the models to localize the objects that they find. This is called object detection. In this chapter, we will use a high-level API to perform object detection using the very popular YOLO algorithm.
Object Detection in Images
We will use the ImageAI library to perform YOLO object detection on an image with two objects in it. The model is going to be Tiny YOLO version 3.
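A rough sketch of this step with ImageAI's `ObjectDetection` class follows. The file paths are hypothetical, the Tiny YOLOv3 model file has to be downloaded separately, and `confident_detections` is an illustrative helper that is not part of ImageAI.

```python
def detect_objects(image_path, output_path, model_path):
    """Run Tiny YOLOv3 on one image and return ImageAI's detections."""
    # Imported lazily so that defining the function does not require ImageAI.
    from imageai.Detection import ObjectDetection

    detector = ObjectDetection()
    detector.setModelTypeAsTinyYOLOv3()
    detector.setModelPath(model_path)  # path to the downloaded model file
    detector.loadModel()
    # Each detection is a dict with "name", "percentage_probability"
    # and "box_points"; an annotated copy is written to output_path.
    return detector.detectObjectsFromImage(
        input_image=image_path,
        output_image_path=output_path,
    )


def confident_detections(detections, minimum_probability=50.0):
    """Keep (name, probability) pairs at or above a threshold (hypothetical helper)."""
    return [
        (d["name"], d["percentage_probability"])
        for d in detections
        if d["percentage_probability"] >= minimum_probability
    ]


# Usage with hypothetical paths:
# found = detect_objects("image.jpg", "image_detected.jpg", "yolo-tiny.h5")
# print(confident_detections(found))
```

The threshold of 50 is an arbitrary choice for the example; lowering it shows more, less certain, detections.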
Object Detection in Video
In this chapter, we will learn how to use the ImageAI library to perform object detection in videos! We have a short 10-second clip of cars moving on a road, and we will use Tiny YOLO v3 to perform object detection on this clip.
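Video detection with ImageAI could be sketched as below, using its `VideoObjectDetection` class. The paths, the `frames_per_second` value, and the `on_frame` callback that tallies detections are all assumptions made for illustration; the course may simply call `detectObjectsFromVideo` without a callback.

```python
from collections import Counter

# Running tally of detections, summed over all processed frames.
# Note: this counts per-frame detections, not unique objects.
object_totals = Counter()


def on_frame(frame_number, output_array, output_count):
    """Callback ImageAI calls per frame; output_count maps name to count."""
    object_totals.update(output_count)


def detect_in_video(input_path, output_path, model_path):
    """Run Tiny YOLOv3 on a video and return the path of the annotated video."""
    # Imported lazily so that defining the function does not require ImageAI.
    from imageai.Detection import VideoObjectDetection

    detector = VideoObjectDetection()
    detector.setModelTypeAsTinyYOLOv3()
    detector.setModelPath(model_path)
    detector.loadModel()
    return detector.detectObjectsFromVideo(
        input_file_path=input_path,
        output_file_path=output_path,  # ImageAI appends the file extension
        frames_per_second=20,          # arbitrary choice for this sketch
        per_frame_function=on_frame,
        log_progress=True,
    )


# Usage with hypothetical paths:
# detect_in_video("traffic.mp4", "traffic_detected", "yolo-tiny.h5")
# print(object_totals.most_common())
```

Because the same car appears in many consecutive frames, the totals grow with clip length and are best read as relative frequencies.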
About the Host (Amit Yadav)
I have been writing code since 1993, when I was 11, and my first passion project was database management software that I wrote for a local hospital. More recently, I wrote an award-winning education chatbot for a multi-billion-revenue company. It solved a recurring problem for my client, who wanted to make basic cyber safety and privacy education accessible to their users. This bot enabled my client to reach out to their customers with personalised, real-time education. Over the past year, I’ve continued my interest in this field by constantly learning and growing in Machine Learning, NLP and Deep Learning. I'm very excited to share my experience and learnings with you with the help of Rhyme.com.