# TensorFlow (Beginner): Avoid Overfitting Using Regularization

Welcome to this project on avoiding overfitting with regularization. We will look at two regularization techniques, weight regularization and dropout, and use them to reduce overfitting in our neural network models.

We will cover the following tasks in 56 minutes:

### Introduction

When we train neural network models, we may notice the model performing significantly better on the training data than on data it has not seen before. We expect the model to learn the underlying patterns in a given dataset, but often it will also memorize the training examples: it learns patterns that are anomalous or peculiar to that particular dataset. This phenomenon is called overfitting. It is a problem because a model that is overfit to the training data will not generalize well to unseen data, which defeats the point of training the model at all. We want models that give predictions on new data that are as accurate as their predictions on the training data.

### Importing the Data

We will be working with the popular Fashion MNIST dataset in this project. It is readily available with Keras. We can use a handy helper function called load_data to unpack all the examples into two tuples, each with two arrays (the examples and their labels). The first tuple has the training set and the second one has the test set. The Fashion MNIST dataset has 28 by 28 pixel images. The labels are simply integers from 0 to 9 for the 10 classes in this dataset.
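As a minimal sketch of the loading step, assuming TensorFlow 2.x is installed, the helper can be wrapped like this. The wrapper name `load_fashion_mnist` and the `CLASS_NAMES` list are illustrative additions; the class names come from the dataset's own documentation, not from this project.

```python
import tensorflow as tf

# Human-readable names for the integer labels 0..9, taken from the
# Fashion MNIST documentation (not part of the arrays Keras returns).
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def load_fashion_mnist():
    """Return ((x_train, y_train), (x_test, y_test)) as NumPy arrays.

    The first call downloads the dataset and caches it locally.
    """
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
    return (x_train, y_train), (x_test, y_test)
```

Calling `load_fashion_mnist()` should give 60,000 training images and 10,000 test images, each a 28 by 28 array, together with one integer label per image.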

### Processing the Data

In this task, we will process our data. First, let’s convert the labels to their one-hot-encoded representations. Currently the labels are integers from 0 to 9, with each value representing a unique class. We will convert them so that each label is a 10-dimensional vector with a 1 at the index of its class and 0 everywhere else. In this one-hot-encoded representation, each dimension in the label vector represents a class. We also need to reshape the examples from 28 x 28 arrays to 784-dimensional vectors. This unrolling is done simply to make it easy to feed the examples to the neural network models.
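The two processing steps can be sketched with plain NumPy as follows. The function names `one_hot` and `flatten_images` are hypothetical, and the project itself may use a Keras utility such as `tf.keras.utils.to_categorical` for the encoding instead; small synthetic arrays stand in for the real dataset here.

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Turn integer labels of shape (n,) into one-hot vectors of shape (n, num_classes)."""
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]


def flatten_images(images):
    """Unroll images of shape (n, 28, 28) into vectors of shape (n, 784)."""
    return images.reshape(images.shape[0], -1)


# Synthetic stand-ins for a batch of three labels and three images.
labels = np.array([0, 3, 9])
images = np.zeros((3, 28, 28), dtype=np.uint8)

print(one_hot(labels).shape)         # (3, 10)
print(flatten_images(images).shape)  # (3, 784)
```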

### Regularization and Dropout

One of the reasons for overfitting is that some of the model's weights can become large and therefore too influential on the linear outputs of various hidden units, and subsequently on the non-linear outputs of the activation functions as well. By regularizing the weights so that their values don’t become too large, we can reduce overfitting. With dropout, we randomly remove certain nodes from the model during training, which forces the model not to rely too much on any particular weight. The result is, much like with weight regularization, that the weight values are kept from becoming too large, thereby reducing overfitting.
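To make the two ideas concrete, here is a small NumPy sketch. It assumes the L2 form of weight regularization (a penalty proportional to the sum of squared weights) and the common "inverted" form of dropout, where surviving activations are rescaled; the text above does not say which variants the project uses, and the function names are illustrative.

```python
import numpy as np


def l2_penalty(weights, lam):
    """Return lam * sum(w^2), the extra term added to the loss under L2 regularization."""
    return lam * np.sum(np.square(weights))


def dropout(activations, rate, rng):
    """Zero each activation with probability `rate` and rescale the survivors."""
    keep_mask = rng.random(activations.shape) >= rate
    # Dividing by (1 - rate) keeps the expected activation unchanged.
    return activations * keep_mask / (1.0 - rate)


rng = np.random.default_rng(0)

weights = np.array([0.5, -2.0, 3.0])
print(l2_penalty(weights, lam=0.01))  # large weights contribute most of the penalty

hidden = np.ones(8)
print(dropout(hidden, rate=0.5, rng=rng))  # each entry is either 0.0 or 2.0
```

Because the penalty grows with the square of each weight, the optimizer is pushed toward smaller weights, while the random mask means no single unit can be relied on at every training step.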

### Creating the Experiment Part 1

In this task and the next one, we will set up a simple experiment. We will create a model, train it on the Fashion MNIST dataset, and then display its performance on the training set and on the test set, which we will use for validation. We will write some helper functions so that the whole experiment can be run in one go, since we need to run it a few times: first for a model without any regularization, then for a model with the two regularization techniques applied. We will start with a function that creates a Sequential model with a couple of hidden layers and one output layer.
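One possible shape for such a function is sketched below. Only the name create_model and the overall layout (Sequential, a couple of hidden layers, one output layer) come from the text; the flags `weight_reg` and `dropout_reg`, the layer width of 64, the L2 factor of 0.001, the dropout rate of 0.2, and the choice of optimizer are all assumptions made for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers


def create_model(weight_reg=False, dropout_reg=False):
    """Build and compile a small classifier for 784-dimensional inputs and 10 classes."""
    # Assumed L2 factor; None means the Dense layers are not regularized.
    kernel_regularizer = regularizers.l2(0.001) if weight_reg else None

    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(784,)))
    model.add(layers.Dense(64, activation="relu", kernel_regularizer=kernel_regularizer))
    if dropout_reg:
        model.add(layers.Dropout(0.2))  # assumed dropout rate
    model.add(layers.Dense(64, activation="relu", kernel_regularizer=kernel_regularizer))
    if dropout_reg:
        model.add(layers.Dropout(0.2))
    model.add(layers.Dense(10, activation="softmax"))

    # categorical_crossentropy matches one-hot-encoded labels.
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```

With both flags left at False this gives the unregularized baseline; setting them to True gives the regularized variant, so the same function serves every run of the experiment.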

### Creating the Experiment Part 2

Training a Keras model returns a history object that records the metrics for each epoch. Let’s create a function that uses this history object to display the training accuracy and the validation accuracy. Then let’s write one function to actually run the whole experiment. When we run this function it should do a few things:

1. Create a model using the create_model function.
2. Train the model.
3. Display the training and validation accuracies by calling the show_acc function.

Additionally, I will set up a simple logger callback, because the full console log that Keras prints during training takes up too much space. I will turn off the default output with the verbose parameter and use the simple logger to display just the epoch number for each epoch.
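The steps above can be sketched roughly as follows, assuming TensorFlow 2.x. The name show_acc comes from the text; `run_experiment`, its parameters, and the use of `LambdaCallback` for the logger are assumptions. To keep the snippet standalone, the model-building function is passed in as an argument, and show_acc prints a plain-text table as a stand-in for whatever display the project actually uses.

```python
import tensorflow as tf


def show_acc(history):
    """Print training and validation accuracy for every epoch."""
    train_acc = history.history["accuracy"]
    val_acc = history.history["val_accuracy"]
    print()
    for epoch, (train, val) in enumerate(zip(train_acc, val_acc)):
        print(f"epoch {epoch:2d}  train acc {train:.3f}  val acc {val:.3f}")


def run_experiment(create_model, x_train, y_train, x_test, y_test, epochs=20):
    """Create a model, train it quietly, then display the accuracies."""
    model = create_model()

    # Minimal logger: print only the epoch number at the end of each epoch.
    simple_logger = tf.keras.callbacks.LambdaCallback(
        on_epoch_end=lambda epoch, logs: print(epoch, end=" ")
    )

    history = model.fit(
        x_train, y_train,
        validation_data=(x_test, y_test),  # the test set doubles as validation data
        epochs=epochs,
        verbose=0,  # silence the default Keras progress output
        callbacks=[simple_logger],
    )
    show_acc(history)
    return history
```

Passing the model builder as a callable means the same function can run the baseline and the regularized variants without any other change.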

### Results

Now that your training is complete, you should be able to see the training accuracy and the validation accuracy. Your results will be different from mine, but the overall picture should be similar. For the model without regularization, the training accuracy keeps increasing as we train for more epochs and reaches well over 90%, but the validation accuracy stays at about 86%. This gap is a clear case of overfitting. Applying our two regularization techniques reduces it.


I am a machine learning engineer with focus in computer vision and sequence modelling for automated signal processing using deep learning techniques. My previous experiences include leading chatbot development for a large corporation.

## Frequently Asked Questions

##### How is this different from YouTube, PluralSight, Udemy, etc.?
In Rhyme, all projects are completely hands-on. You don't just passively watch someone else. You use the software directly while following the host's (Amit Yadav) instructions. Using the software is the only way to achieve mastery. With the "Live Guide" option, you can ask for help and get immediate response.
##### Is this session really free?
Absolutely! Your host (Amit Yadav) has provided this session completely free of cost!
##### How do I create my own projects like this?
You can go to https://rhyme.com, sign up for free, and follow this visual guide How to use Rhyme to create your own projects. If you have custom needs or company-specific environment, please email us at help@rhyme.com
##### Can I buy Rhyme sessions for my company or learning institution?
Absolutely. We offer Rhyme for workgroups as well as larger departments and companies. Universities, academies, and bootcamps can also buy Rhyme for their settings. You can select projects and trainings that are mission critical for you, and you can also author your own to reflect your needs and tech environments. Please email us at help@rhyme.com
##### What kind of accessibility options does Rhyme provide?
Rhyme strives to ensure that visual instructions are helpful for reading impairments. The Rhyme interface has features like resolution and zoom that will be helpful for visual impairments. We are also developing closed-caption functionality to help with hearing impairments. Most of the accessibility options of the cloud desktop's operating system or the specific application can also be used in Rhyme. If you have questions related to accessibility, please email us at accessibility@rhyme.com
##### Why don't you just use containers or virtual browsers?
We started with Windows and Linux cloud desktops because they offer the most flexibility in teaching any software (desktop or web). However, web applications like Salesforce can run directly through a virtual browser, and others like Jupyter and RStudio can run in containers and be accessed through virtual browsers. We are currently working on features that let such web applications run without cloud desktops. The rest of the Rhyme learning, authoring, and monitoring interfaces will remain the same.
##### I have a different question
Please email us at help@rhyme.com and we'll respond to you within one business day.
