Google Cloud AI: Generative Adversarial Network

Hi and welcome to the second project in our Google Cloud AI Platform series, where we will learn to use the Cloud AI Platform and ultimately build an end-to-end deep learning application on it. In this project we are creating and training a Generative Adversarial Network (GAN) to synthesize new MNIST images. That is, we will create a model that learns to produce realistic images of hand-written digits by learning from actual hand-written digits in the MNIST dataset. These generated images are fake in the sense that a machine is writing them rather than a human, as is the case with the original MNIST images. This is the same technique you may have seen in the news being used to create fake images, videos, or audio, commonly called deep fakes. Even though there's a negative connotation to creating deep fakes, the underlying technique can be very useful. As a simple example, imagine how cool it would be if, say, you were a game designer and could get realistic-looking images for your characters, textures, or even entire scenes at the click of a button. That's not far from reality at this point, and in this project we will learn the same technique as we generate fake images of hand-written digits.


Task List


We will cover the following tasks in 1 hour and 8 minutes:


Introduction & Importing Libraries

Unlike the previous series on TensorFlow on Rhyme, this series on the Cloud AI Platform should be taken in sequence. Even though we are looking at a new concept in this project, a lot of the Cloud Platform material builds on earlier projects, so not everything may make sense if you take this project independently of the series. If this is the first project you are doing in the series, I'd recommend going back and starting the series from the beginning with the first project, then coming back to this one.

We want to start leveraging GPUs for our model training from now on. CPUs can be quite slow and limited in performance when working with tensors, while GPUs do much, much better; in many cases you can speed up your computations by a factor of 100 or more. Certain deep learning models simply aren't practical to train on CPUs alone for that reason. One really cool thing about the Google Cloud AI Platform is that it allows you to provision different types of GPUs for your Notebook instances, or rather, for the virtual machines that your Notebook instances run on.
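Once your instance has a GPU attached, it's worth confirming that TensorFlow can actually see it. Here is a quick sanity check you can run in a notebook cell, assuming TensorFlow 2.x:

```python
import tensorflow as tf

# List the GPUs visible to TensorFlow. On a Notebook instance with an
# attached GPU (e.g. a K80), this should print at least one device.
gpus = tf.config.list_physical_devices("GPU")
print("GPUs available:", gpus)
```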


Importing & Processing the Data

We have worked with the MNIST dataset a few times before, so you should be quite familiar with it by now. Let’s import it. Even though we are quite familiar with the dataset, let’s take a look at some of the images. Note that we are going to create a Convolutional Network for our GAN, so we will need channel information on the image examples.
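The exact preprocessing in the session may differ slightly, but a minimal sketch looks like this: load MNIST from tf.keras.datasets, scale the pixels (here to [-1, 1], a common choice when the generator ends in a tanh activation), and add the channel dimension the convolutional layers will expect:

```python
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# We only need the training images; the labels are irrelevant for a GAN.
(x_train, _), _ = tf.keras.datasets.mnist.load_data()

# Scale to [-1, 1] and add a channel axis: (60000, 28, 28) -> (60000, 28, 28, 1)
x_train = (x_train.astype("float32") - 127.5) / 127.5
x_train = np.expand_dims(x_train, axis=-1)
print(x_train.shape)

# Take a look at a few examples.
for i in range(9):
    plt.subplot(3, 3, i + 1)
    plt.imshow(x_train[i, :, :, 0], cmap="binary")
    plt.axis("off")
plt.show()
```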


Generator Model

The idea is that the generator has to get better and better as it trains, to try to fool the discriminator. In turn, the discriminator keeps learning as well and tries to recognise the fake images even when they look like real ones. Over time, as both models train, they keep improving, and ultimately we reach a point where the generator starts creating very realistic images.

So, we will start by creating the generator model. It will take a random sample of 100 values as input and should be able to create a new 28 by 28 image out of it. This means the model has to scale up the information and fill in the blanks, generating a lot more information than it is given.
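The architecture used in the session may differ; here is a minimal sketch of such a generator, using a Dense layer followed by Conv2DTranspose upsampling, with a tanh output to match the [-1, 1] pixel scaling above:

```python
from tensorflow.keras import layers, models

noise_dim = 100  # size of the random input vector

generator = models.Sequential([
    layers.Dense(7 * 7 * 128, input_shape=(noise_dim,)),
    layers.LeakyReLU(0.2),
    layers.Reshape((7, 7, 128)),
    # Upsample 7x7 -> 14x14
    layers.Conv2DTranspose(64, kernel_size=5, strides=2, padding="same"),
    layers.BatchNormalization(),
    layers.LeakyReLU(0.2),
    # Upsample 14x14 -> 28x28, one output channel with values in [-1, 1]
    layers.Conv2DTranspose(1, kernel_size=5, strides=2, padding="same",
                           activation="tanh"),
])
generator.summary()
```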


Discriminator Model

Now we want to create a model that learns to distinguish between real and fake images as it trains. This looks more like the regular convolutional networks we are used to: the model takes a 28 by 28 image as input and gives us a binary yes or no output. While the generator will be trained as part of the final GAN model that we are yet to create, the discriminator will be trained independently, even though it is still part of the GAN model. I know this may sound a bit confusing, but it will become clearer in the next task.
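A minimal sketch of such a discriminator (again, the session's exact layers may differ) mirrors the generator, downsampling with strided Conv2D layers and ending in a single sigmoid unit. Since it is trained on its own, it gets its own compile step:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

discriminator = models.Sequential([
    # Downsample 28x28 -> 14x14
    layers.Conv2D(64, kernel_size=5, strides=2, padding="same",
                  input_shape=(28, 28, 1)),
    layers.LeakyReLU(0.2),
    layers.Dropout(0.3),
    # Downsample 14x14 -> 7x7
    layers.Conv2D(128, kernel_size=5, strides=2, padding="same"),
    layers.LeakyReLU(0.2),
    layers.Dropout(0.3),
    layers.Flatten(),
    # Probability that the input image is real
    layers.Dense(1, activation="sigmoid"),
])

discriminator.compile(
    optimizer=tf.keras.optimizers.Adam(1e-4),
    loss="binary_crossentropy",
)
```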


Generative Adversarial Network

Now that we have the generator and the discriminator, all we have to do in order to create the GAN is connect the two together. As always, we will use Keras' functional API to do this. One small detail to note here is that when we train this GAN later, we essentially want to train JUST the generator on a given batch. The discriminator must be trained separately, of course, but when it is used as part of the GAN it does not undergo any training; it is used just to, well, judge which of the input images are fake and which are real. So, to that effect, the first thing we will do is set the discriminator to not trainable. We will change this later in the training loop, but only for the part where the discriminator is trained independently.
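Here is a sketch of how that wiring might look with the functional API, reusing the generator and discriminator defined above (noise_dim is the 100-value input size from before):

```python
import tensorflow as tf

# Freeze the discriminator's weights inside the combined model. It can
# still be trained through its own separately compiled model.
discriminator.trainable = False

# Functional API: noise -> generator -> discriminator -> real/fake score
noise_input = tf.keras.Input(shape=(noise_dim,))
fake_image = generator(noise_input)
gan_output = discriminator(fake_image)

gan = tf.keras.Model(noise_input, gan_output)
gan.compile(
    optimizer=tf.keras.optimizers.Adam(1e-4),
    loss="binary_crossentropy",
)
```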


Plotting Function

When we train the GAN, we might want to take a look at the generated images at every epoch. This will help us see how the generator is performing. So, let’s create a function called plot and we will call it later from the training loop. Of course the images right now are just random noise because the generator isn’t trained yet. Let’s do the training in the next task.
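A simple version of this plot function might look like the following; the 4 by 4 grid size is my own choice, and the rescaling from [-1, 1] back to [0, 1] assumes the tanh generator output used above:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot(generator, noise_dim=100):
    """Generate 16 images from random noise and display them in a 4x4 grid."""
    noise = np.random.randn(16, noise_dim)
    images = generator.predict(noise, verbose=0)
    images = (images + 1) / 2  # rescale from [-1, 1] to [0, 1] for display
    for i in range(16):
        plt.subplot(4, 4, i + 1)
        plt.imshow(images[i, :, :, 0], cmap="binary")
        plt.axis("off")
    plt.show()

plot(generator)  # just noise for now - the generator is untrained
```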


Training the GAN

We have created the GAN and now all we have to do is train it and take a look at the results! What we want to do here is first generate a bunch of noise vectors and feed them into the generator. Then, we combine the generated images with some real images from the MNIST dataset. We also create labels: 0 for the fake images and 1 for the real images. This gives us the training set for the discriminator to learn from. There's a small detail here I'll get to when writing the code.

Then we will generate some noise vectors again, but this time feed them into the GAN. This time we will set all the labels to 1, i.e. "real", forcing the generator to try to figure out a way to best beat the currently trained discriminator. Note that we won't update or train the discriminator at this stage, when training the GAN. The training will take a little bit of time. Specifically, if you're using the same K80 GPU that I am using, you should see every epoch take between 90 and 100 seconds.
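Putting those two alternating steps together, a training loop in this spirit might look like the sketch below. The batch size and epoch count are my own assumptions, not necessarily the values from the session, and the loop builds on the x_train, generator, discriminator, gan, and plot objects defined in the earlier cells:

```python
import numpy as np

batch_size = 128
epochs = 25
steps_per_epoch = x_train.shape[0] // batch_size

for epoch in range(epochs):
    for step in range(steps_per_epoch):
        # --- Train the discriminator on a mix of fake and real images ---
        discriminator.trainable = True
        noise = np.random.randn(batch_size, noise_dim)
        fake_images = generator.predict(noise, verbose=0)
        real_images = x_train[np.random.randint(0, x_train.shape[0], batch_size)]

        x = np.concatenate([fake_images, real_images])
        y = np.concatenate([np.zeros((batch_size, 1)),  # 0 = fake
                            np.ones((batch_size, 1))])  # 1 = real
        d_loss = discriminator.train_on_batch(x, y)

        # --- Train the generator through the frozen discriminator ---
        discriminator.trainable = False
        noise = np.random.randn(batch_size, noise_dim)
        # Label everything "real" so the generator is rewarded for fooling
        # the discriminator.
        g_loss = gan.train_on_batch(noise, np.ones((batch_size, 1)))

    print(f"Epoch {epoch + 1}: d_loss={d_loss:.3f}, g_loss={g_loss:.3f}")
    plot(generator)  # inspect the generated images after each epoch
```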

Also, if you run the plot function, you'll get a few new images from the generator. When you see super realistic deep fakes, keep in mind that not every single example generated by a GAN is going to be realistic, even after long training with a lot of data; but some examples will be very, very close to the real images. Researchers often display the best, hand-picked results rather than ALL the generated results, which can make generators seem a bit better than they actually are. This is still a pretty amazing technique, of course, and one that I hope will be useful to you in the future.



About the Host (Amit Yadav)


I am a machine learning engineer with a focus on computer vision and sequence modelling for automated signal processing using deep learning techniques. My previous experience includes leading chatbot development for a large corporation.



Frequently Asked Questions


How do Rhyme's hands-on projects work?
In Rhyme, all projects are completely hands-on. You don't just passively watch someone else. You use the software directly while following the host's (Amit Yadav) instructions. Using the software is the only way to achieve mastery. With the "Live Guide" option, you can ask for help and get an immediate response.

What do I need to install before joining?
Nothing! Just join through your web browser. Your host (Amit Yadav) has already installed all required software and configured all data.

How can I create my own projects on Rhyme?
You can go to https://rhyme.com/for-companies, sign up for free, and follow the visual guide "How to use Rhyme" to create your own projects. If you have custom needs or a company-specific environment, please email us at help@rhyme.com.

Can I buy Rhyme for my team, company, or school?
Absolutely. We offer Rhyme for workgroups as well as larger departments and companies. Universities, academies, and bootcamps can also buy Rhyme for their settings. You can select projects and trainings that are mission-critical for you and, as well, author your own that reflect your own needs and tech environments. Please email us at help@rhyme.com.

What accessibility features does Rhyme offer?
Rhyme's visual instructions are somewhat helpful for reading impairments. The Rhyme interface has features like resolution and zoom that are slightly helpful for visual impairments. And we are currently developing closed-captioning functionality to help with hearing impairments. Most of the accessibility options of the cloud desktop's operating system or the specific application can also be used in Rhyme. However, we still have a lot of work to do. If you have suggestions for accessibility, please email us at accessibility@rhyme.com.

Which platforms and applications does Rhyme support?
We started with Windows and Linux cloud desktops because they have the most flexibility in teaching any software (desktop or web). However, web applications like Salesforce can run directly through a virtual browser. And others, like Jupyter and RStudio, can run on containers and be accessed by virtual browsers. We are currently working on features where such web applications won't need to run through cloud desktops, but the rest of the Rhyme learning, authoring, and monitoring interfaces will remain the same.

What if I have other questions?
Please email us at help@rhyme.com and we'll respond to you within one business day.


More Projects by Amit Yadav


Amazon SageMaker: Custom Scripts
1 hour and 14 minutes