We will cover the following tasks in 1 hour and 14 minutes:
When the project launches, you should see Chrome open automatically in your cloud desktop, showing the SageMaker home page. Let's log into our AWS accounts: you can pause my video at this point, log into your AWS Management Console, and resume the recording once you are logged in. Once you resume, you will be taken to the next task, in which we will create a notebook instance and start working on our custom model project.
Create Notebook Instance
Let’s create a notebook instance in SageMaker Notebooks, using the default settings. Once the instance is ready, we will access it through Jupyter and create a notebook for our project.
Custom Training Script
Even though we are using a custom model, we still need to create a SageMaker Estimator, just like in any other SageMaker project. The difference is that we will write a custom model and training script and provide it as the entry point to our Estimator instance. Let’s call this script train.py. When we train such an Estimator later, SageMaker will automatically create a TensorFlow environment, set up the environment variables, and execute the script.
In this task, we are going to define a model function that creates and returns the TensorFlow CNN model we will use for training. If you have seen a few deep learning examples before, you have definitely come across the MNIST dataset, and that’s the dataset we will be using in this project as well. MNIST consists of 28x28 grayscale images of hand-written digits as examples, with their corresponding numeric classes as labels. Since we already created the train.py file in the previous task, we will simply append the model function to it.
Second Convolution Block
We continue to work on our model function. In this task, we will create the second convolution block, similar in structure to the first convolution block of the model.
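To give a feel for the architecture, here is a standalone Keras-style sketch of a two-block MNIST CNN. The filter counts, dense layer size, and function name are assumptions for illustration, not the course's exact values; in the project itself the layers live inside the custom model function in train.py.

```python
import tensorflow as tf

def build_mnist_cnn():
    """Hypothetical sketch of the project's CNN; hyperparameters are assumptions."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),  # MNIST grayscale images
        # First convolution block
        tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        # Second convolution block: same structure, more filters
        tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),  # one score per digit class
    ])
```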
Wrapping Model Function
Finally, we will define the estimator specifications for the various modes: training, prediction, and evaluation. We will specify a loss function as well. Next, we need a function that can serve input to the Estimator, and an argument parser to read and set a few values from the instance environment that SageMaker will set up during training or inference.
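The overall shape of a model function that returns a different EstimatorSpec per mode looks roughly like this. The build_logits helper is a hypothetical stand-in for the CNN defined earlier in train.py, and the Adam optimizer and sparse softmax cross-entropy loss are assumptions for illustration:

```python
import tensorflow as tf

def build_logits(x):
    # Hypothetical stand-in for the CNN: flatten and one dense layer.
    x = tf.reshape(x, [-1, 28 * 28])
    return tf.compat.v1.layers.dense(x, 10)

def model_fn(features, labels, mode):
    logits = build_logits(features["x"])
    predictions = {"classes": tf.argmax(logits, axis=1)}

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)

    # Loss shared by training and evaluation (an assumed choice of loss).
    loss = tf.compat.v1.losses.sparse_softmax_cross_entropy(labels, logits)

    if mode == tf.estimator.ModeKeys.TRAIN:
        optimizer = tf.compat.v1.train.AdamOptimizer()
        train_op = optimizer.minimize(loss, tf.compat.v1.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

    # EVAL mode: report accuracy alongside the loss.
    metrics = {"accuracy": tf.compat.v1.metrics.accuracy(labels, predictions["classes"])}
    return tf.estimator.EstimatorSpec(mode, loss=loss, eval_metric_ops=metrics)
```

The Estimator framework calls this function itself, passing the appropriate mode, so we never invoke it directly.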
Now, we will define a few helper functions to load the training and testing data, parse the arguments given to the script, and serve inputs to the Estimator. Again, we will continue to append the code to our train.py file. The argument parser reads the environment variables that SageMaker sets up and parses them into arguments for use in our script.
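SageMaker exposes paths and data channels to the training container through SM_* environment variables (for example SM_MODEL_DIR and SM_CHANNEL_TRAINING). A minimal sketch of such a parser, with the --epochs flag and fallback paths as assumptions:

```python
import argparse
import os

def parse_args(argv=None):
    # Read SageMaker's SM_* environment variables as defaults so the
    # script also runs locally when those variables are not set.
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument(
        "--model_dir", type=str,
        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"))
    parser.add_argument(
        "--train", type=str,
        default=os.environ.get("SM_CHANNEL_TRAINING", "/opt/ml/input/data/training"))
    return parser.parse_args(argv)
```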
Serving Input Function
The serving input function simply provides an instance of ServingInputReceiver to our TensorFlow Estimator. We will need to specify the input feature dictionary here. With this done, we have the serving input function, the helper functions for loading data, and of course the argument parser.
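As a sketch, assuming the model consumes a single feature named "x" holding batches of 28x28 images, the function can map raw request tensors straight to that feature dictionary:

```python
import tensorflow as tf

def serving_input_fn():
    # Assumption: the model's input feature is named "x" and holds
    # batches of 28x28 images. The same dictionary is used both as the
    # receiver tensors (what the request provides) and as the features
    # (what the model consumes).
    inputs = {"x": tf.compat.v1.placeholder(tf.float32, shape=[None, 28, 28])}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)
```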
Finishing Training Script
Let’s continue working on our training script. Now, we will write the main function, where we will create an estimator and then use the helper functions we wrote earlier to load the training and evaluation data. TrainSpec determines the input data for training as well as its duration. EvalSpec combines the details of evaluating the trained model as well as exporting it.
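The pairing of the two specs can be sketched as follows; the max_steps value and the exporter name are assumptions for illustration:

```python
import tensorflow as tf

def build_specs(train_input_fn, eval_input_fn, serving_input_fn, max_steps=1000):
    """Pair a TrainSpec with an EvalSpec that also exports the model.

    max_steps and the exporter name are illustrative assumptions.
    """
    # TrainSpec: where training data comes from and how long to train.
    train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=max_steps)
    # EvalSpec: evaluation data plus an exporter that saves the latest
    # trained model in a servable format.
    exporter = tf.estimator.LatestExporter("mnist-model", serving_input_fn)
    eval_spec = tf.estimator.EvalSpec(input_fn=eval_input_fn, exporters=exporter)
    return train_spec, eval_spec
```

In the main function, tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec) then drives the whole train/evaluate/export loop.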
The rest of the code that we will write in the project should look familiar if you have done the previous few projects in this SageMaker series. We will create a SageMaker session, create a SageMaker estimator, and use the estimator to train and deploy our model.
Create and Train Estimator
SageMaker provides us with an Estimator specifically for TensorFlow. The arguments are similar to those of any SageMaker estimator, except that this time we need to provide an entry point: the train.py script we created earlier in the project. The training job will take about 6 or 7 minutes to complete.
Deploy The Model
We will use the estimator instance to deploy our model. This is really simple in SageMaker: all we have to do is call the deploy method on the estimator and specify how many instances we want and of what type. Please note that training and deployment do accrue some cost if you don’t have free AWS credits. A training job automatically cleans up the resources it used, so once it reports that the job is complete, the resources are released and the billable seconds appear in the logs. For deployed models, however, you will need to manually delete the resources and the endpoint; otherwise you will keep getting billed for them.
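The deploy step described above can be sketched as a small wrapper; the instance type and count are assumptions, and in practice you would pick the smallest instance that meets your latency needs:

```python
def deploy_model(estimator):
    # Deploys the trained model behind a real-time endpoint and returns
    # a predictor object for making requests. Instance type and count
    # here are illustrative assumptions.
    predictor = estimator.deploy(
        initial_instance_count=1,
        instance_type="ml.m5.large",
    )
    return predictor

# Endpoints bill until deleted -- when you are finished, call:
#   predictor.delete_endpoint()
```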
Using Endpoint for Inference
Let’s download the test set locally and use it to get some predictions from our deployed model. We will run inference with the preprocessed MNIST evaluation data and confirm that the model is actually trained and performs as expected. Make sure to delete the endpoint afterwards to avoid any unnecessary costs.
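Once the endpoint is up, inference is a single predictor.predict(...) call on a batch of images. A small helper can then turn the returned scores into digit labels; the response shape here is an assumption based on TensorFlow Serving's JSON format, where each request yields a "predictions" list of per-class scores:

```python
def predicted_digits(response):
    # Assumption: the endpoint responds with
    # {"predictions": [[p0, ..., p9], ...]}, one score list per image.
    # For each image, pick the index of the highest score.
    digits = []
    for scores in response["predictions"]:
        digits.append(max(range(len(scores)), key=scores.__getitem__))
    return digits
```

Comparing these predicted digits against the MNIST evaluation labels gives a quick sanity check that the deployed model really learned something.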
About the Host (Amit Yadav)
I am a machine learning engineer with a focus on computer vision and sequence modelling for automated signal processing using deep learning techniques. My previous experience includes leading chatbot development for a large corporation.