TensorFlow (Advanced): Neural Style Transfer

Welcome to Neural Style Transfer with TensorFlow! Neural Style Transfer is a technique to apply stylistic features of a Style image onto a Content image while retaining the Content’s overall structure and complex features. We will see how to create content and style models, compute content and style costs and ultimately run a training loop to optimise a proposed image which retains content features while imparting stylistic features.


Task List

We will cover the following tasks in 1 hour and 5 minutes:


We start with a content image and a style image. We then write an algorithm that generates a new image, retaining the main features of the content image while applying the style of the style image. The result is the content image rendered in the style of the style image, which can produce striking results.

This technique was proposed by Leon A. Gatys, Alexander S. Ecker and Matthias Bethge in their 2015 paper, A Neural Algorithm of Artistic Style.

Import the Model

Let’s start by importing a model which we will use to perform the Neural Style Transfer. The algorithm relies on a model pre-trained on a large image dataset, whose intermediate layers work like feature detectors. We take the output of one of these intermediate layers for our content image and compare it with the output of the same layer for a proposed stylised image. This comparison gives us a content cost. Similarly, we take the output of some of the intermediate layers for the style image and compare it with the output of those layers for the proposed stylised image. This gives us a style cost. Then, we add the content cost and the style cost together. If we run an optimisation algorithm to minimise this total cost, updating the proposed stylised image along the way, we should get a result which retains some features of the original content image but also carries the stylistic features of the style image.

The original paper uses the VGG19 model and, fortunately, it is easily accessible in TensorFlow.
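As a minimal sketch of this step, assuming the `tf.keras.applications` copy of VGG19 with ImageNet weights and without its classification head, the model could be loaded as below. The helper name `load_vgg19` and its `weights` parameter are illustrative and not taken from the original project.

```python
import tensorflow as tf


def load_vgg19(weights="imagenet"):
    """Return a frozen VGG19 feature extractor (hypothetical helper)."""
    # include_top=False drops the dense classifier; only the convolutional
    # feature detectors are needed for style transfer.
    model = tf.keras.applications.VGG19(include_top=False, weights=weights)
    # The network is only used to extract features, so freeze its weights.
    model.trainable = False
    return model
```

Calling `load_vgg19()` downloads the pre-trained weights on first use; `model.summary()` then lists the five convolutional blocks by layer name.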

Import Libraries and Helper Functions

We need to import a few helper functions to load and process our images: the load_img function, the img_to_array function and, very importantly, the preprocess_input function from the vgg19 submodule of Keras.

We will also use NumPy and Matplotlib.

Image Processing and Display

The helper functions make it really simple for us to load and process our images. Let’s define a function which does all the preprocessing for us using these helpers. We want to transform an ordinary image into a format the model can understand and work with efficiently. For this we rely on the preprocess_input function we imported from the vgg19 submodule of Keras.

Let’s write one more function, intended for use on a generated image, that is, a proposed stylised image. Such an image is in the processed format, because our input content and style images went through the function above. Essentially, the generated results are arrays, and in order to display them we have to de-process them back into ordinary, human-viewable images.
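The preprocessing function described above might look like the following sketch. It assumes the `tf.keras.utils` versions of `load_img` and `img_to_array`; the function names `preprocess_array` and `load_and_process_image` are our own choices, not names confirmed by the original project.

```python
import numpy as np
import tensorflow as tf


def preprocess_array(img):
    """Turn an (H, W, 3) RGB array with values in 0..255 into a VGG19 input batch."""
    # np.array copies, so the caller's array is not modified in place.
    batch = np.expand_dims(np.array(img, dtype="float32"), axis=0)
    # VGG19 preprocessing: RGB -> BGR and subtraction of the ImageNet channel means.
    return tf.keras.applications.vgg19.preprocess_input(batch)


def load_and_process_image(image_path):
    """Load an image file from disk and preprocess it for VGG19."""
    img = tf.keras.utils.load_img(image_path)
    return preprocess_array(tf.keras.utils.img_to_array(img))
```

Splitting the array step from the file loading keeps the preprocessing usable on images that are already in memory.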
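A de-processing function along these lines could be sketched with NumPy alone. It assumes the input went through the VGG19 preprocessing, which reorders channels to BGR and subtracts the ImageNet channel means; the name `deprocess` and the constant `VGG_BGR_MEANS` are illustrative.

```python
import numpy as np

# ImageNet channel means in BGR order, as subtracted by VGG19 preprocessing.
VGG_BGR_MEANS = (103.939, 116.779, 123.68)


def deprocess(x):
    """Convert a preprocessed (1, H, W, 3) or (H, W, 3) array back to RGB uint8."""
    img = np.array(x, dtype="float32")  # work on a copy
    if img.ndim == 4:
        img = img[0]  # drop the batch dimension
    # Add the channel means back, then flip BGR -> RGB.
    img = img + np.array(VGG_BGR_MEANS, dtype="float32")
    img = img[..., ::-1]
    # Optimisation can push values out of range, so clip before casting.
    return np.clip(img, 0, 255).astype("uint8")
```

The returned array can be passed directly to an image display call such as Matplotlib's `imshow`.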

Content and Style Models

To compute the content cost, we need to look at the activations of an intermediate layer. VGG19 has 5 blocks of layers, each made up of 2 to 4 convolutional layers followed by one pooling layer. For the content cost, we want activations from a layer deep enough that the features of the image are already well represented, so that these features match between the content image and the proposed stylised image as we minimise the overall cost with an optimisation algorithm. More specifically, we will use the block5_conv2 layer.

For the style cost, we can do something similar. We will use 3 different intermediate layers from different blocks to compute our style cost. This is because we want different kinds of stylistic features to influence our cost, and not just the high-level or complex features extracted from the style image. So, we will use three convolutional layers from different blocks in VGG19. Some will give us a low-level, broader understanding of the stylistic features and others will capture more complex ones.
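One way to build these models is to wrap the same frozen VGG19 in several small Keras models, each ending at a chosen layer. In the sketch below, `block5_conv2` comes from the text, while the three style layer names are an assumption, since the text does not name them; `build_feature_models` is a hypothetical helper.

```python
import tensorflow as tf

CONTENT_LAYER = "block5_conv2"  # named in the text
# Assumed choice of three style layers from different blocks.
STYLE_LAYERS = ["block1_conv1", "block3_conv1", "block5_conv1"]


def build_feature_models(weights="imagenet"):
    """Return (content_model, style_models) built on one frozen VGG19."""
    vgg = tf.keras.applications.VGG19(include_top=False, weights=weights)
    vgg.trainable = False
    # Each model maps an input image to the activations of one layer.
    content_model = tf.keras.Model(vgg.input, vgg.get_layer(CONTENT_LAYER).output)
    style_models = [
        tf.keras.Model(vgg.input, vgg.get_layer(name).output)
        for name in STYLE_LAYERS
    ]
    return content_model, style_models
```

Because all the models share the same underlying layers, no weights are duplicated.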

Compute Content Cost

Content cost is quite simple to calculate. We compute the output of the content model for both the content image and the proposed stylised image, and compare the two.
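A minimal sketch of this cost, assuming the comparison is a mean squared difference between the two sets of activations (the text does not spell out the exact formula), could be:

```python
import tensorflow as tf


def content_cost(content_model, content, generated):
    """Mean squared difference between content-layer activations."""
    a_content = content_model(content)
    a_generated = content_model(generated)
    return tf.reduce_mean(tf.square(a_content - a_generated))
```

Here `content_model` is any callable that maps an image batch to activations, so the function can be tried out with a stand-in before the real VGG19 model is wired in.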

Define Gram Matrix

To compute the style cost, we first need to define what is known as a Gram matrix. We calculate Gram matrices for the activations of the style image and of the generated image, and the style cost is the mean squared difference between these two matrices. A Gram matrix captures how strongly the feature channels of a layer are correlated with one another. You could try other techniques here, but the original paper on Neural Style Transfer uses Gram matrices, so that is what we will use as well. The fundamental idea is that these matrices let us match the distribution of features rather than the presence of specific features.
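As a sketch, a Gram matrix for one layer can be computed by flattening the spatial positions and multiplying the result by its own transpose. Dividing by the number of positions is an assumed normalisation; other scalings are possible.

```python
import tensorflow as tf


def gram_matrix(activations):
    """Channel-by-channel correlation of a (1, H, W, C) activation tensor."""
    n_channels = activations.shape[-1]
    # Flatten the spatial positions: one row per position, one column per channel.
    a = tf.reshape(activations, [-1, n_channels])
    n_positions = tf.cast(tf.shape(a)[0], a.dtype)
    # Entry (i, j) sums channel_i * channel_j over all positions.
    return tf.matmul(a, a, transpose_a=True) / n_positions
```

The result is a C x C matrix regardless of the image size, which is why it describes which features occur together rather than where they occur.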

Compute Style Cost

We have several style models, each corresponding to a different intermediate layer of the VGG19 model. Our total style cost is a weighted sum of the costs for each of these models.
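The weighted sum could be sketched as follows. The equal default weights are an assumption, and `gram_matrix` is repeated here only so the snippet runs on its own.

```python
import tensorflow as tf


def gram_matrix(activations):
    """Channel correlations of a (1, H, W, C) tensor, averaged over positions."""
    n_channels = activations.shape[-1]
    a = tf.reshape(activations, [-1, n_channels])
    return tf.matmul(a, a, transpose_a=True) / tf.cast(tf.shape(a)[0], a.dtype)


def style_cost(style_models, style, generated, layer_weights=None):
    """Weighted sum of per-layer Gram-matrix mean squared differences."""
    if layer_weights is None:
        # Assumption: weight every layer equally.
        layer_weights = [1.0 / len(style_models)] * len(style_models)
    total = 0.0
    for model, weight in zip(style_models, layer_weights):
        gram_style = gram_matrix(model(style))
        gram_generated = gram_matrix(model(generated))
        total += weight * tf.reduce_mean(tf.square(gram_style - gram_generated))
    return total
```

Raising the weight of an early layer should emphasise low-level style features, while raising a later layer's weight favours more complex ones.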

Training Loop

In order to generate a stylised image, we now need to follow these steps:

  1. Load the content image and the style image, and initialise the generated image as a copy of the content image. Keep the original content image in a separate variable, because we need it to compute the content cost as the generated image is updated.
  2. Instantiate an Optimiser. We are going to use the Adam Optimiser.
  3. Run the training loop for a given number of iterations:
    1. Compute the total cost for each iteration by calculating the Content Cost and the Style Cost.
    2. Calculate the gradients of the cost with respect to the generated image using gradient tape.
    3. Apply the gradients to update the generated image.
    4. Save the lowest cost and the generated image associated with the step with the lowest cost. Sometimes the cost may start to increase after hitting a minimum, so we want to ensure that we save the image with the lowest cost across all the iterations in a separate variable that we can use later.
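The steps above can be sketched as a small function that takes the total cost as a callable. The name `run_style_transfer`, the default learning rate and the choice to record the image before each update (so that the saved image is exactly the one the saved cost was measured on) are assumptions made for this sketch.

```python
import tensorflow as tf


def run_style_transfer(cost_fn, init_image, iterations=20, learning_rate=7.0):
    """Minimise cost_fn(image) with Adam; return (best_image, best_cost)."""
    generated = tf.Variable(init_image, dtype=tf.float32)
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    best_cost, best_image = float("inf"), generated.numpy()

    for _ in range(iterations):
        # Record operations on the generated image so gradients can be taken.
        with tf.GradientTape() as tape:
            cost = cost_fn(generated)
        grads = tape.gradient(cost, generated)
        # The cost was measured on the image before this step's update, so
        # record that image if it is the best seen so far.
        if float(cost) < best_cost:
            best_cost, best_image = float(cost), generated.numpy()
        # Update the generated image in place.
        optimizer.apply_gradients([(grads, generated)])

    return best_image, best_cost
```

In a full run, `cost_fn` would add the content cost and the style cost for the current image, and `init_image` would be the preprocessed content image.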

Plot the Results

Now that the training loop is complete, we can take a look at the best image. Even after just 20 iterations, the algorithm produces something that looks like a painting of Exeter Cathedral in the style of The Great Wave, or at least an attempt at one. If you run the loop for longer, say about 100 iterations, you can get more interesting and probably aesthetically better results.


Amit Yadav

About the Host (Amit Yadav)

I am a machine learning engineer with a focus on computer vision and sequence modelling for automated signal processing using deep learning techniques. My previous experience includes leading chatbot development for a large corporation.

