TensorFlow (Advanced): Simple Recurrent Neural Network

In this project, we will build a Recurrent Neural Network (RNN) model and train it to read mathematical expressions in string format and calculate their results. Computers are already good at arithmetic, so this may seem like a trivial problem, but it is not. The interesting part is that we give the model string data, not numeric data. The model must therefore infer the meaning of each character from a sequence of text and then learn addition from the examples alone.

Task List

We will cover the following tasks in 49 minutes:


In this project, we want to create an RNN model and train it to learn the meaning of each character and to perform simple addition, using only the given data. RNNs are well suited to this problem because both the input and the output are sequences: the model must read the input sequence and then predict an output sequence.

Generate Data

The model needs numeric tensors as input, so we cannot feed it raw characters. We first need a suitable representation of each character, and ultimately of each entire sequence. We will do this by converting the characters to one-hot-encoded vectors. The dimension of each vector equals the length of our all_chars list, which is the total number of features we have.
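As a minimal sketch of this step, the code below generates one addition example and one-hot encodes a single character. The contents of all_chars (ten digits plus '+'), the operand range, and the helper name one_hot are assumptions made for illustration; only generate_data and all_chars are named in the project description.

```python
import numpy as np

# Assumed vocabulary: the ten digits plus the '+' sign.
all_chars = "0123456789+"
num_features = len(all_chars)

# Lookup tables between characters and their vector positions.
char_to_index = {c: i for i, c in enumerate(all_chars)}
index_to_char = {i: c for i, c in enumerate(all_chars)}

def generate_data(low=0, high=100, rng=None):
    """Return one (example, label) pair of strings, such as ('12+7', '19').

    The operand range [low, high) is an assumption, not taken from the course.
    """
    if rng is None:
        rng = np.random.default_rng()
    first = int(rng.integers(low, high))
    second = int(rng.integers(low, high))
    return f"{first}+{second}", str(first + second)

def one_hot(char):
    """Hypothetical helper: encode one character as a one-hot vector."""
    vec = np.zeros(num_features)
    vec[char_to_index[char]] = 1.0
    return vec

example, label = generate_data(rng=np.random.default_rng(0))
print(example, "=", label)
print(one_hot("+"))
```

Each vector has exactly one entry set to 1, at the position of its character in all_chars.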

Create the Model

The model that we are making has two sections. The first part, the encoder, is a single SimpleRNN layer with a number of hidden units. Its output is a single vector representation of the entire input. The decoder expects a sequence, so we pass this vector through a RepeatVector layer and specify how many times it should be repeated, once for each output time step. The repeated vector is then fed into the decoder, another RNN layer that returns its output at every time step and so generates the predicted sequence. Each time step in the output sequence needs to predict probabilities for the characters that can appear at that time step, so we use a Dense layer with the softmax activation function. The only tricky part is that we wrap this Dense layer in a TimeDistributed layer, which tells the model to apply the same Dense layer to each time step individually, where the hidden state differs from one time step to the next.
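The architecture described above could be sketched in Keras roughly as follows. The values of num_features, max_time_steps, and hidden_units, as well as the loss and optimizer, are assumptions chosen for illustration and are not taken from the course.

```python
import tensorflow as tf
from tensorflow.keras.layers import (
    Dense, Input, RepeatVector, SimpleRNN, TimeDistributed,
)

num_features = 11    # assumed: ten digits plus '+'
max_time_steps = 5   # assumed: '99+99' is five characters long
hidden_units = 128   # assumed size of the hidden state

model = tf.keras.Sequential([
    Input(shape=(max_time_steps, num_features)),
    # Encoder: returns only its final output, one vector per example.
    SimpleRNN(hidden_units),
    # Repeat that vector once for every output time step.
    RepeatVector(max_time_steps),
    # Decoder: return_sequences=True gives one output per time step.
    SimpleRNN(hidden_units, return_sequences=True),
    # Apply the same softmax classifier to every time step.
    TimeDistributed(Dense(num_features, activation="softmax")),
])

model.compile(
    loss="categorical_crossentropy",
    optimizer="adam",
    metrics=["accuracy"],
)
model.summary()
```

The output has one probability distribution over the characters for each of the max_time_steps positions.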

Vectorize and Devectorize Data

We now have the model that we would like to train. We also have the data, but not yet in the format the model needs: the strings must be vectorized before they can be used with our RNN. Let's define a function that vectorizes an example and label pair generated by the generate_data function we wrote before.

Let's write another function to de-vectorize an example back into a string. The model only needs the vectorized examples, but we will still want to convert some test examples and predictions back into a human-readable format so that we can inspect them.
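One possible shape for these two functions is sketched below. The function names, the vocabulary, max_time_steps, and the choice to left-pad short strings with '0' are all assumptions; left-padding with zeros is used here because it leaves the value of the expression and of the answer unchanged.

```python
import numpy as np

all_chars = "0123456789+"   # assumed vocabulary
num_features = len(all_chars)
char_to_index = {c: i for i, c in enumerate(all_chars)}
index_to_char = {i: c for i, c in enumerate(all_chars)}
max_time_steps = 5          # assumed fixed sequence length

def vectorize_example(example, label):
    """Convert an (example, label) string pair to one-hot arrays."""
    x = np.zeros((max_time_steps, num_features))
    y = np.zeros((max_time_steps, num_features))
    # Left-pad with '0' so every sequence has max_time_steps characters.
    padded_x = example.rjust(max_time_steps, "0")
    padded_y = label.rjust(max_time_steps, "0")
    for t, char in enumerate(padded_x):
        x[t, char_to_index[char]] = 1.0
    for t, char in enumerate(padded_y):
        y[t, char_to_index[char]] = 1.0
    return x, y

def devectorize_example(vectorized):
    """Convert a (time_steps, num_features) array back to a string."""
    indices = np.argmax(vectorized, axis=-1)
    return "".join(index_to_char[int(i)] for i in indices)

x, y = vectorize_example("12+7", "19")
print(devectorize_example(x), devectorize_example(y))
```

Because devectorize_example takes the argmax at each time step, it works on one-hot labels and on softmax predictions alike.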

Create Dataset

Let's define a function to create our dataset. We put this in a function so that we can reuse the code later, when we need another dataset for testing. It is simple to do: we use a for loop to build one-hot-encoded representations of the randomly generated examples from the previously defined generate_data function.
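A sketch of such a function is shown below. To keep the block self-contained it includes compact stand-ins for generate_data and for the vectorization step; the names create_dataset and one_hot_sequence, the default of 2000 examples, and the fixed random seed are assumptions for illustration.

```python
import numpy as np

all_chars = "0123456789+"   # assumed vocabulary
max_time_steps = 5          # assumed fixed sequence length
rng = np.random.default_rng(42)

def generate_data():
    """Compact stand-in for the generate_data function described earlier."""
    first, second = rng.integers(0, 100, size=2)
    return f"{first}+{second}", str(first + second)

def one_hot_sequence(text):
    """Hypothetical helper: left-pad with '0', then one-hot encode."""
    padded = text.rjust(max_time_steps, "0")
    return np.eye(len(all_chars))[[all_chars.index(c) for c in padded]]

def create_dataset(num_examples=2000):
    """Build arrays of shape (num_examples, max_time_steps, num_features)."""
    x = np.zeros((num_examples, max_time_steps, len(all_chars)))
    y = np.zeros((num_examples, max_time_steps, len(all_chars)))
    for i in range(num_examples):
        example, label = generate_data()
        x[i] = one_hot_sequence(example)
        y[i] = one_hot_sequence(label)
    return x, y

x_train, y_train = create_dataset(100)
print(x_train.shape, y_train.shape)
```

Calling create_dataset again later yields a fresh set of random examples that can serve as a test set.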

Training the Model

Before we train the model, let's create a couple of callbacks. The first is a LambdaCallback that simplifies our logging: we may end up training for a couple of hundred epochs, so we use a lambda function to print only the validation accuracy after each epoch. The second is the EarlyStopping callback, which we set to monitor the validation loss with a fairly high patience of, say, 10 epochs. We are also going to use a relatively large batch size to speed up training, along with a 20% validation split.
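The callbacks and training call could look roughly like the sketch below. The helper name train, the batch size of 256, and the default of 500 epochs are assumptions; the 'val_accuracy' log key assumes the model was compiled with metrics=["accuracy"].

```python
import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping, LambdaCallback

# Print only the validation accuracy at the end of each epoch.
simple_logger = LambdaCallback(
    on_epoch_end=lambda epoch, logs: print(
        f"epoch {epoch + 1}: val_accuracy={logs['val_accuracy']:.2f}"
    )
)

# Stop once validation loss has not improved for 10 consecutive epochs.
early_stopping = EarlyStopping(monitor="val_loss", patience=10)

def train(model, x, y, epochs=500, batch_size=256):
    """Hypothetical helper: fit a compiled model with both callbacks."""
    return model.fit(
        x, y,
        epochs=epochs,
        batch_size=batch_size,   # assumed value for a "high" batch size
        validation_split=0.2,    # hold out 20% of the data for validation
        verbose=0,               # silence the default progress output
        callbacks=[simple_logger, early_stopping],
    )
```

With verbose=0, the one-line message from the LambdaCallback is the only output printed per epoch, and EarlyStopping ends training well before the epoch limit if the validation loss plateaus.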

About the Host (Amit Yadav)

I am a Software Engineer with many years of experience in writing commercial software. My current areas of interest include computer vision and sequence modelling for automated signal processing using deep learning as well as developing chatbots.

Frequently Asked Questions

In Rhyme, all projects are completely hands-on. You don't just passively watch someone else. You use the software directly while following the host's (Amit Yadav) instructions. Using the software is the only way to achieve mastery. With the "Live Guide" option, you can ask for help and get an immediate response.
Nothing! Just join through your web browser. Your host (Amit Yadav) has already installed all required software and configured all data.
You can go to https://rhyme.com/for-companies, sign up for free, and follow the visual guide "How to use Rhyme" to create your own projects. If you have custom needs or a company-specific environment, please email us at help@rhyme.com
Absolutely. We offer Rhyme for workgroups as well as larger departments and companies. Universities, academies, and bootcamps can also buy Rhyme for their settings. You can select projects and trainings that are mission critical for you and also author your own to reflect your own needs and tech environments. Please email us at help@rhyme.com
Please email us at help@rhyme.com and we'll respond to you within one business day.
