4.5 / 5
We will cover the following tasks in 48 minutes:
In this project, we are looking to solve a regression problem with the help of a neural network model and some data. In a regression task, we train the network to predict a continuous value given a set of input features. This is different from classification - in classification problems, we train a model to predict discrete values or classes. To be able to complete this project successfully, you should have a basic understanding of programming with Python. It would also help if you have used Jupyter Notebooks before.
You should have a basic understanding of how neural networks work as well. It’s okay if you don’t understand the math behind neural networks because we are going to use a high-level API, but I will assume that you have some conceptual understanding of neural networks.
Importing the Data
First of all, we will import the dataset. We will use the popular Pandas library to import and process the data. The head() function gives us the first 5 rows by default, so we can see the typical values for our features as given in the dataset. There are 5000 examples in the dataset, and it may not be possible for us to go through all of them manually, so we want to ensure that there aren’t any missing values in the dataset before we start using it. Pandas makes this really simple: we can use the isna() function to find any missing or not available values in the dataset.
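As a minimal sketch of this step, the snippet below loads a tiny synthetic CSV (a stand-in for the course dataset, with placeholder column names that are assumptions, not the real ones), shows the first rows with head(), and counts missing values per column with isna().

```python
import io

import pandas as pd

# Synthetic stand-in for the course dataset; the column names and values
# are placeholders invented for this illustration.
csv_text = """serial,feature_a,feature_b,price
0,2009,21,14264
1,2007,4,12032
2,2016,,14452
"""

df = pd.read_csv(io.StringIO(csv_text))

# head() returns the first 5 rows by default (this toy frame only has 3).
print(df.head())

# isna() marks each missing entry as True; sum() then counts them per column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```

If every count is zero, there are no missing values and we can move on without any cleaning.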
We can make it easier for optimization algorithms to converge towards minima faster by normalizing the data before training a model. You may remember that a neural network model is trained with the help of an optimization algorithm that tries to minimize the difference between the ground truth, or actual labels, and the model’s predictions. Normalization of data makes it easier for such an optimization algorithm to take gradient steps in the right direction more consistently. Normalization is simply changing the distribution of different features so that the values for different features are in similar ranges.
We will ignore the first column because it’s not a feature and is just a serial number. Then we will normalize the data using the mean and standard deviation of each column. This operation is column-wise by default.
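The normalization described here can be sketched in a few lines of Pandas. The data frame below is synthetic and its column names are assumptions made for the example. The point is only that mean() and std() are computed per column, so the subtraction and division are applied column by column.

```python
import pandas as pd

# Synthetic data: a serial-number column followed by two value columns.
df = pd.DataFrame({
    "serial": [0, 1, 2, 3],
    "feature_a": [10.0, 20.0, 30.0, 40.0],
    "price": [100.0, 300.0, 500.0, 700.0],
})

# Ignore the first column, which is just a serial number.
df = df.iloc[:, 1:]

# mean() and std() return one value per column; Pandas aligns them
# with the columns of df, so each column is normalized independently.
df_norm = (df - df.mean()) / df.std()
print(df_norm)
```

After this step, each column has a mean of roughly 0 and a standard deviation of roughly 1.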
Training and Test Sets
In this task, we will split our Data Frame into the features and the labels, and then we will split that data into two sets: one for training and another one for testing. In Pandas, we can select data by column position using iloc. Let’s look at our Data Frame’s head that we printed out in the last task after normalization. Notice that only the first 6 columns of the Data Frame are the features, so let’s store them separately as X. We will use a helper function called train_test_split from scikit-learn to do this split. We are using this helper function because it will not only conveniently split our data into the two sets that we need, it will also randomly shuffle the input data before splitting it. This random shuffle is a pretty good idea: if the observations were recorded or arranged in any kind of order for any column, that pattern may lead our neural network to find correlations that depend on how the data was observed or stored, rather than meaningful correlations that generalize well.
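A rough sketch of this split is shown below. It uses random synthetic data with 6 feature columns and one label column; the column names, the label position, and the test_size value are assumptions for the example rather than the values used in the project.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 rows, 6 feature columns and 1 label column.
rng = np.random.default_rng(0)
columns = [f"f{i}" for i in range(6)] + ["label"]
data = pd.DataFrame(rng.normal(size=(100, 7)), columns=columns)

# iloc selects by position: the first 6 columns are features, the last is the label.
X = data.iloc[:, :6]
y = data.iloc[:, -1]

# shuffle=True is the default; random_state only makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=0
)

print(X_train.shape, X_test.shape)
```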
Create the Model
Let’s write a function that returns an untrained model of a certain architecture. We are using a simple neural network architecture with just 3 hidden layers. We are going to use the relu activation function on all the layers except for the output layer. Since this is a regression problem, we just need the linear output without any activation here.
Mean squared error is a pretty common loss function for regression problems. Remember, this is the loss function that the optimization algorithm tries to minimize. We are using a variant of the stochastic gradient descent algorithm called adam. At a beginner level, it’s okay if you don’t understand the exact math behind this as long as you understand what an optimization algorithm is.
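One possible shape for such a function is sketched below with the Keras Sequential API. The text only fixes 3 hidden layers with relu, a linear output, mean squared error, and adam; the layer sizes (10, 20, 5), the function name get_model, and the n_features argument are assumptions made for this illustration.

```python
import numpy as np
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Sequential


def get_model(n_features=6):
    """Return an untrained regression model (layer sizes are illustrative)."""
    model = Sequential([
        Input(shape=(n_features,)),
        Dense(10, activation="relu"),  # hidden layer 1
        Dense(20, activation="relu"),  # hidden layer 2
        Dense(5, activation="relu"),   # hidden layer 3
        Dense(1),                      # linear output: no activation
    ])
    # Mean squared error as the loss, minimized with the adam optimizer.
    model.compile(loss="mse", optimizer="adam")
    return model


model = get_model()

# An untrained model can already predict, but its outputs are not meaningful yet.
preds = model.predict(np.zeros((4, 6), dtype="float32"), verbose=0)
print(preds.shape)
```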
Usually, we don’t know in advance how long we may need to train a model for. Fortunately, we can use an EarlyStopping callback from Keras to stop the model training if the validation loss stops decreasing for a few epochs. The validation loss is calculated on the test set and not on the training set, so it’s a better metric to use when deciding whether to stop training.
By using the early stopping callback, we can be generous with the epoch value here and set it to a very high number, and the model will simply stop training when it doesn’t see any improvement in validation loss for 5 consecutive epochs.
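To make the stopping rule concrete, here is a simplified, plain-Python sketch of the patience logic. It is not the Keras implementation, and the function name epochs_until_stop and the list of validation losses are made up for the illustration. In Keras itself, this behaviour is typically requested with EarlyStopping(monitor='val_loss', patience=5) passed to fit through the callbacks argument.

```python
def epochs_until_stop(val_losses, patience=5):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` epochs in a row.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss  # new best value: reset the counter
            wait = 0
        else:
            wait += 1    # no improvement this epoch
            if wait >= patience:
                return epoch
    return None


# Made-up validation losses: the best value (0.45) is reached at epoch 4.
val_losses = [0.9, 0.6, 0.5, 0.45, 0.46, 0.47, 0.45, 0.48, 0.46, 0.44]
stop_epoch = epochs_until_stop(val_losses, patience=5)
print(stop_epoch)
```

In this made-up run the loss stops improving after epoch 4, so training ends at epoch 9 even though a much larger epoch limit could have been set.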
Previously, we had stored predictions from an untrained model in a variable called preds_on_untrained. Now that we have the trained model, let’s store predictions from the trained model on the same test set in another variable. Now, let’s use the compare_predictions helper function to compare predictions from the model when it was untrained and when it was trained.
The trained model predictions are in green and the untrained ones in red. This makes sense. The untrained model predictions are quite random, and the trained model predictions are closer to the blue line, which is the ground truth or the actual values of prices for this set. So, we can see that after the training, the model does a significantly better job at predictions.
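The same comparison can also be made numerically. The sketch below does not use the project's compare_predictions helper or its data; it builds synthetic stand-ins (all variable names here are hypothetical) and compares the mean squared error of each set of predictions against the ground truth.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-ins, not the project's real values.
y_true = rng.normal(size=50)                              # ground truth
preds_untrained_demo = rng.normal(size=50)                # unrelated to y_true
preds_trained_demo = y_true + 0.1 * rng.normal(size=50)   # close to y_true


def mse(y, p):
    """Mean squared error between targets y and predictions p."""
    return float(np.mean((y - p) ** 2))


mse_untrained = mse(y_true, preds_untrained_demo)
mse_trained = mse(y_true, preds_trained_demo)
print(mse_untrained, mse_trained)
```

A much lower error for the trained predictions corresponds to the green points sitting closer to the blue line in the plot.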
Be careful. I have a Mac and I don't Type this  (hook parenthese)
This was the fourth course I beta tested for this platform. It takes time to get used to this platform. It's great, once you get used to it.
This project was great! In a future project, I'd love to learn how you might deal with missing values or work with a data set that isn't so clean.
This is my 2nd rhyme course. Although engaging with the material is outstanding, it would be useful to have key terms (e.g. callback) listed to learn prior or during.
my VGA got very hot = protection reset. (old desk, Quadro 2k) I'm thinking in others alternatives. The interaction is very good. Congrats. I liked it. Thank you!
Some matters are better taught when the video is not showing a notebook. With regards to the cloud desktop, I had some difficulties with the window slicer and occasionally with the keyboard inputs (in particular with backspace).
Both panels barely fit my screen. Also, when the window moves in videos, it's dizzying
lack of fullscreen is major drawback however. (Or, if there is fullscreen option, UI did not make this obvious).
Videos can be stored locally for use in the future
About the Host (Amit Yadav)
I am a machine learning engineer with a focus on computer vision and sequence modelling for automated signal processing using deep learning techniques. My previous experience includes leading chatbot development for a large corporation.