AWS DeepRacer: Getting Started

AWS DeepRacer is a simplified reinforcement learning ecosystem. It includes a small physical race car called the DeepRacer, a DeepRacer simulator that runs on Amazon SageMaker, and a racing league for the autonomous racing car.

Available On Coursera

Task List

We will cover the following tasks in 34 minutes:

What is AWS DeepRacer

A DeepRacer user only needs to write high-level code for a reward function; model creation, training, and evaluation are automated. This makes it a fun way to get started with machine learning.

DeepRacer is also an autonomous racing league. You can submit models trained in the simulator to various races, and if your models do well, you can proceed to a Championship Cup, where you download your model to a physical DeepRacer and compete against other DeepRacers for a grand prize.

DeepRacer Action Space

In this system, our simulated DeepRacer is an agent. An agent can take actions in a given world, which in our case is the race track. The aim of reinforcement learning is to make such an agent learn to make a good sequence of decisions in its environment. To do that, the agent interacts with the world repeatedly, and we need some measure of the goodness of the sequence of decisions it takes. The action space is the set of actions that the agent can take at any step, and you can define it yourself. As it trains, the agent takes many steps, that is, a sequence of decisions. Initially, it makes wild guesses about what a good sequence of actions might be, because it doesn’t know anything about the world yet. This is called exploration: through the agent, the model is essentially discovering the world, or the environment.
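To make the idea of an action space concrete, here is a minimal sketch of a discrete action space built as combinations of steering angle and speed, plus a purely random "exploration" step. The function names, the dictionary keys, and every numeric value (30 degrees, 5 steering levels, 3.0 speed, 2 speed levels) are assumptions chosen for illustration, not values taken from the project.

```python
import random
from itertools import product


def build_action_space(max_steering_deg=30.0, steering_levels=5,
                       max_speed=3.0, speed_levels=2):
    """Enumerate discrete (steering angle, speed) actions.

    All defaults are illustrative assumptions; steering_levels must be >= 2.
    """
    # Steering angles spread evenly from -max_steering_deg to +max_steering_deg.
    step = 2 * max_steering_deg / (steering_levels - 1)
    steering = [-max_steering_deg + i * step for i in range(steering_levels)]
    # Speeds spread evenly up to max_speed, excluding zero.
    speeds = [max_speed * (i + 1) / speed_levels for i in range(speed_levels)]
    # Every combination of a steering angle and a speed is one action.
    return [{"steering_angle": s, "speed": v} for s, v in product(steering, speeds)]


def explore(actions, rng=random):
    """Pick an action uniformly at random: the 'wild guess' an untrained agent makes."""
    return rng.choice(actions)


ACTIONS = build_action_space()
```

With these defaults the agent has 5 x 2 = 10 actions to choose from at every step. A larger action space gives the agent finer control but also more to discover during exploration.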

DeepRacer Parameters

The agent, over the course of its training, learns a policy. A policy maps the experiences that the agent has had to the actions or decisions that the agent should make. Initially the policy is pretty random, but over the course of training the agent keeps updating it to make it better. But how does the agent tell a good sequence of decisions from a bad one? This is where the reward function comes in. The reward function quantifies the goodness of the decisions taken by the agent. We program it to reward the agent for what we consider good behaviour and punish it for what we consider bad behaviour. If our agent goes off the track and into the grass, that’s bad behaviour and we should punish the agent for it. If our agent manages to finish the entire track, that’s good behaviour and we should reward it.
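The reward-and-punish idea above can be sketched as a very small reward function. This assumes the convention that DeepRacer calls a Python function named `reward_function` with a dictionary of parameters and expects a number back, and that the dictionary contains `all_wheels_on_track` and `progress` (a percentage). The specific reward values (0.001, 1.0, and the bonus of 10.0) are arbitrary choices for illustration.

```python
def reward_function(params):
    """Reward staying on the track, punish leaving it, and add a bonus for finishing."""
    # Bad behaviour: the car has left the track, so give a near-zero reward.
    if not params["all_wheels_on_track"]:
        return 1e-3

    # Baseline reward for a step taken while on the track.
    reward = 1.0

    # Good behaviour: the whole track is finished, so add a bonus (arbitrary size).
    if params["progress"] >= 100.0:
        reward += 10.0

    return float(reward)
```

For example, `reward_function({"all_wheels_on_track": True, "progress": 40.0})` returns `1.0`, while the same call with `all_wheels_on_track` set to `False` returns `0.001`.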

Reward Function

To help us create an effective reward function, DeepRacer provides a number of parameters from the environment, such as the speed of the agent, its co-ordinates, how much progress it has made, how many steps it has taken, how close it is to the centre of the track, whether all the wheels are inside the track, and so on. Our main task, as DeepRacer programmers, is to define this reward function: essentially, to decide what behaviour to reward and what behaviour to punish. If we do that job well, the DeepRacer will, as it trains, learn to make a good sequence of decisions to achieve its goal, that is, finishing the track and finishing it quickly. We will keep things relatively simple, given that this is our first try and because it’s easy to end up using a lot of these parameters and making the reward function more complex than it needs to be. If you’re not sure, focus on the long-term goals and let training take its course and figure out how to achieve them.
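As a sketch of how a few of these parameters might be combined, the function below rewards staying near the centre of the track and making progress in few steps. The key names (`all_wheels_on_track`, `distance_from_center`, `track_width`, `progress`, `steps`) follow the DeepRacer input parameters as we understand them, so check them against the current documentation. The weights and the formula itself are our own assumptions, not the project's reward function.

```python
def reward_function(params):
    """Combine track position with a long-term 'finish quickly' signal."""
    # Leaving the track earns a near-zero reward regardless of anything else.
    if not params["all_wheels_on_track"]:
        return 1e-3

    # Centring score: 1.0 on the centre line, falling linearly to 0.0 at the edge.
    half_width = 0.5 * params["track_width"]
    centering = max(0.0, 1.0 - params["distance_from_center"] / half_width)
    reward = 0.5 + centering

    # Long-term goal: more progress (in percent) per step means a faster lap.
    if params["steps"] > 0:
        reward += params["progress"] / params["steps"]

    return float(reward)
```

Only two signals are used here on purpose: one for staying on course and one for finishing quickly. How the car actually achieves those goals is left to training.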

Model Training and Evaluation

Finally, we watch the model train, keeping an eye on the plot of reward against time as well as the simulated DeepRacer video stream. Model evaluation tells us whether the trained model can finish laps and how quickly it does so. Based on this evaluation, you can choose to submit your trained model to the racing league.
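The two questions that evaluation answers (did the model finish laps, and how quickly?) can be sketched as a small summary helper. The input format here is hypothetical: we assume each evaluation trial is a dictionary with a `progress` percentage and a `time_s` duration in seconds, which is not necessarily how DeepRacer reports its results.

```python
def summarize_evaluation(trials):
    """Summarize evaluation trials given as dicts with 'progress' and 'time_s'.

    The trial format is a hypothetical one chosen for this sketch.
    """
    # A lap counts as completed only when progress reaches 100 percent.
    completed = [t for t in trials if t["progress"] >= 100.0]
    # Best lap time among completed laps, or None if no lap was finished.
    best = min((t["time_s"] for t in completed), default=None)
    return {
        "trials": len(trials),
        "completed_laps": len(completed),
        "best_lap_time_s": best,
    }
```

A model that completes most of its trials with a stable lap time is a reasonable candidate to submit; one that rarely finishes probably needs more training or a different reward function.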

About the Host (Amit Yadav)

I am a machine learning engineer with a focus on computer vision and sequence modelling for automated signal processing using deep learning techniques. My previous experience includes leading chatbot development for a large corporation.
