Getting started with AWS DeepRacer!!

Published at
11/13/2024
Categories
aws
awsdeepracer
reinforcementlearning
machinelearning
Author
Mayur Bhatti

AWS DeepRacer is an integrated learning system that lets users of all levels learn and explore reinforcement learning, and experiment with and build autonomous driving applications. This post walks you through building a model in the new DeepRacer console. It can help you get ready for the DeepRacer League races at your nearest summits and for the Virtual Circuit races that take place throughout the year.

What is AWS DeepRacer?

AWS DeepRacer is a fully autonomous 1/18th scale race car driven by reinforcement learning. You train its model in the AWS DeepRacer console, where you also supply a reward function that tells the agent (the DeepRacer car) whether the action it took led to a good, bad, or neutral outcome.

The AWS DeepRacer console is a graphical user interface to interact with the AWS DeepRacer service. You can use the console to train a reinforcement learning model and to evaluate the model performance in the AWS DeepRacer simulator built upon AWS RoboMaker. In the console, you can also download a trained model for deployment to your AWS DeepRacer vehicle for autonomous driving in a physical environment.

Types of League Races:

There are three main racing formats in the AWS DeepRacer League:

  • Head-to-Head Racing: In this mode, racers navigate the track while avoiding other AWS bot cars. The top racers qualify for a head-to-head elimination bracket at the end of each month.
  • Object Avoidance: In this format, participants must complete the track while dodging obstacles. The fastest time wins and advances toward the Championship Cup.
  • Time Trial: Here, racers aim to complete the track within a set number of laps while staying on track. The fastest racer in this category advances to the Championship Cup.

Understanding Reinforcement Learning:

In reinforcement learning (RL), an agent (in this case, the AWS DeepRacer car) interacts with an environment (the track) and learns by receiving rewards for taking certain actions that lead to desirable outcomes. The agent’s goal is to maximize these rewards by identifying the best actions over time.

Key Reinforcement Learning (RL) Terms:

  • Agent - The agent is the AWS DeepRacer vehicle that needs to be trained. More specifically, it embodies the neural network that controls the vehicle, taking inputs and deciding on actions.
  • Environment - The environment contains a track that defines where the vehicle can drive. The agent explores the environment to collect data to train the underlying neural network.
  • State - The state is a snapshot of the environment the agent is in at a point in time. The front-facing camera on the vehicle captures this state.
  • Action - An action is a decision made by the agent in the current state. For AWS DeepRacer, an action corresponds to a vehicle move at a particular speed and steering angle.
  • Reward - The reward is the score given as feedback to the agent when it takes an action in a given state. In training the AWS DeepRacer model, the reward is returned by a reward function. In general, you define or supply a reward function to specify which actions are desirable or undesirable for the agent to take in a given state.
  • Episode - An episode is one run of the agent from its starting point until it terminates, for example by finishing the track or going off track.
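To make these terms concrete, here is a minimal sketch of the agent-environment loop. It is a toy stand-in, not the DeepRacer simulator: the one-dimensional "track", the drift model, and the names `run_episode`, `steer_to_center`, and `no_steering` are all assumptions made up for illustration.

```python
import random


def run_episode(policy, track_half_width=0.5, max_steps=50, seed=0):
    """Run one episode of a toy 'stay near the center line' task."""
    rng = random.Random(seed)
    offset = 0.0                 # state: signed distance from the center line
    total_reward = 0.0
    for step in range(1, max_steps + 1):
        action = policy(offset)              # action: steering nudge chosen by the agent
        drift = rng.uniform(-0.1, 0.1)       # environment: random disturbance
        offset += action + drift             # environment moves to the next state
        if abs(offset) > track_half_width:
            return total_reward, step        # episode ends: the car left the track
        # reward: 1.0 on the center line, falling to 0.0 at the track edge
        total_reward += 1.0 - abs(offset) / track_half_width
    return total_reward, max_steps           # episode ends: step budget used up


def steer_to_center(offset):
    """Hypothetical hand-written policy: steer halfway back to the center."""
    return -0.5 * offset


def no_steering(offset):
    """Hypothetical baseline policy: never steer."""
    return 0.0


for name, policy in [("steer_to_center", steer_to_center), ("no_steering", no_steering)]:
    total, steps = run_episode(policy)
    print(f"{name}: reward={total:.2f} over {steps} steps")
```

In real training the policy is a neural network whose weights are updated from the collected rewards; here the policies are fixed functions so the loop itself stays visible.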

Services Used & Architecture:

AWS DeepRacer leverages several AWS services:

  • AWS DeepRacer Console: Interface for training and evaluating models.
  • Amazon SageMaker: Manages and trains the RL models.
  • AWS RoboMaker: Simulates the virtual track environment.
  • Amazon S3: Stores the model data.
  • Amazon Kinesis: Streams data for live feedback.
  • Amazon CloudWatch: Monitors and logs activity.

Steps to Build Your AWS DeepRacer Model:

Let’s train and evaluate our virtual model! Here are the steps (note that you can follow along here by creating an AWS account):

  1. Create a model in the AWS DeepRacer console.
  2. Set the name, description, and track.
  3. Set your action space (possible actions for the model to take).
  4. Create your reward function.
  5. Set your hyperparameters.
  6. Start training!
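For step 3, it can help to picture a discrete action space as a grid of steering angles and speeds. The sketch below builds such a grid. The function name `build_action_space`, its parameters, and the default values are illustrative assumptions, not the console's defaults.

```python
def build_action_space(max_steering=30.0, steering_levels=5,
                       max_speed=3.0, speed_levels=2):
    """Return every (steering angle, speed) combination as a list of actions."""
    if steering_levels < 2 or speed_levels < 1:
        raise ValueError("need at least 2 steering levels and 1 speed level")

    # Evenly spaced angles from -max_steering to +max_steering (degrees)
    step = 2 * max_steering / (steering_levels - 1)
    steering_angles = [-max_steering + i * step for i in range(steering_levels)]

    # Evenly spaced speeds up to max_speed (m/s), excluding zero
    speeds = [max_speed * (j + 1) / speed_levels for j in range(speed_levels)]

    combos = [(angle, speed) for angle in steering_angles for speed in speeds]
    return [
        {"index": k, "steering_angle": angle, "speed": speed}
        for k, (angle, speed) in enumerate(combos)
    ]


for action in build_action_space():
    print(action)
```

A larger grid gives the agent finer control but usually takes longer to train, because there are more actions to explore.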

Evaluating the Model:

For an in-depth guide to model evaluation, check this out.

Building an Effective Reward Function:

The AWS DeepRacer reward function takes a dictionary object as the input.

def reward_function(params):
    # Compute the reward from the values in params
    reward = ...

    return float(reward)

The params dictionary object contains the following key-value pairs:

{
    "all_wheels_on_track": Boolean,        # flag to indicate if the agent is on the track
    "x": float,                            # agent's x-coordinate in meters
    "y": float,                            # agent's y-coordinate in meters
    "closest_objects": [int, int],         # zero-based indices of the two closest objects to the agent's current position of (x, y).
    "closest_waypoints": [int, int],       # indices of the two nearest waypoints.
    "distance_from_center": float,         # distance in meters from the track center 
    "is_crashed": Boolean,                 # Boolean flag to indicate whether the agent has crashed.
    "is_left_of_center": Boolean,          # Flag to indicate if the agent is on the left side to the track center or not. 
    "is_offtrack": Boolean,                # Boolean flag to indicate whether the agent has gone off track.
    "is_reversed": Boolean,                # flag to indicate if the agent is driving clockwise (True) or counter clockwise (False).
    "heading": float,                      # agent's yaw in degrees
    "objects_distance": [float, ],         # list of the objects' distances in meters between 0 and track_length in relation to the starting line.
    "objects_heading": [float, ],          # list of the objects' headings in degrees between -180 and 180.
    "objects_left_of_center": [Boolean, ], # list of Boolean flags indicating whether each object is left of the center (True) or not (False).
    "objects_location": [(float, float),], # list of object locations [(x,y), ...].
    "objects_speed": [float, ],            # list of the objects' speeds in meters per second.
    "progress": float,                     # percentage of track completed
    "speed": float,                        # agent's speed in meters per second (m/s)
    "steering_angle": float,               # agent's steering angle in degrees
    "steps": int,                          # number of steps completed
    "track_length": float,                 # track length in meters.
    "track_width": float,                  # width of the track
    "waypoints": [(float, float), ]        # list of (x,y) as milestones along the track center
}
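As a small illustration of reading these keys, here is a sketch of a reward that only looks at `all_wheels_on_track`, `track_width`, and `distance_from_center`. The name `centerline_reward` and the sample values are assumptions for illustration; this is not the function we ended up using.

```python
def centerline_reward(params):
    """Reward staying close to the center line; near zero when off track."""
    if not params["all_wheels_on_track"]:
        return 1e-3

    half_width = 0.5 * params["track_width"]
    # 1.0 on the center line, falling linearly to 0.0 at the track edge
    closeness = max(0.0, 1.0 - params["distance_from_center"] / half_width)
    return float(max(closeness, 1e-3))


# Hand-made sample input with only the keys this sketch reads
sample_params = {
    "all_wheels_on_track": True,
    "track_width": 1.0,
    "distance_from_center": 0.1,
}
print(centerline_reward(sample_params))
```

Calling a reward function locally with hand-made dictionaries like this is a cheap way to catch typos in key names before spending training hours.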

The reward function that worked for us is:

import math


def reward_function(params):
    # Read input variables
    waypoints = params['waypoints']
    closest_waypoints = params['closest_waypoints']
    heading = params['heading']
    progress = params['progress']

    # Start from a baseline reward
    reward = 1.0

    # Bonus for completing the lap
    if progress == 100:
        reward += 100

    # Calculate the direction of the center line based on the closest waypoints
    next_point = waypoints[closest_waypoints[1]]
    prev_point = waypoints[closest_waypoints[0]]

    # Calculate the direction in radians; atan2(dy, dx) returns a value in (-pi, pi]
    track_direction = math.atan2(next_point[1] - prev_point[1], next_point[0] - prev_point[0])

    # Convert to degrees
    track_direction = math.degrees(track_direction)

    # Calculate the difference between the track direction and the heading direction of the car
    direction_diff = abs(track_direction - heading)
    # Both angles lie in [-180, 180], so fold differences above 180 back into [0, 180]
    if direction_diff > 180:
        direction_diff = 360 - direction_diff

    # Penalize the reward if the difference is too large
    DIRECTION_THRESHOLD = 10.0

    if direction_diff > DIRECTION_THRESHOLD:
        # Scale the reward down linearly; it reaches 0 at a 50-degree difference
        malus = max(0.0, 1.0 - direction_diff / 50)
        reward *= malus

    return float(reward)
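The heading comparison in this function is easy to check on its own. Both the track direction and the heading are reported between -180 and 180 degrees, so a plain subtraction can report a difference close to 360 degrees for two directions that are nearly the same. The sketch below folds the difference back into the 0 to 180 range. The helper name `heading_error` and the sample points are assumptions for illustration.

```python
import math


def heading_error(prev_point, next_point, heading):
    """Smallest angle in degrees between the track direction and the car's heading."""
    dy = next_point[1] - prev_point[1]
    dx = next_point[0] - prev_point[0]
    track_direction = math.degrees(math.atan2(dy, dx))

    diff = abs(track_direction - heading)
    # Fold differences above 180 degrees back into [0, 180]
    if diff > 180:
        diff = 360 - diff
    return diff


# Track points along the negative x-axis (direction 180 degrees), car heading -170 degrees:
# the raw difference is 350 degrees, but the car is only 10 degrees off.
print(heading_error((0.0, 0.0), (-1.0, 0.0), -170.0))
```

Without the fold, a car driving almost perfectly along a stretch that points in the negative x direction could be penalized as if it were facing the wrong way.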

For more examples of reward functions, you can visit this link.

Pricing:

AWS DeepRacer provides a Free Tier that covers the first 10 hours of training or evaluation and 5 GB of storage during your first month. After that, it costs approximately:

  • Training or evaluation: $3.50 per hour
  • Model storage: $0.023 per GB-month
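To get a feel for the numbers, here is a rough estimator built from the rates above. It assumes both Free Tier allowances apply only in the first month and ignores taxes and any other charges. The function name `estimate_monthly_cost` is hypothetical, and actual billing may differ.

```python
TRAINING_RATE_PER_HOUR = 3.50       # USD, rate quoted above
STORAGE_RATE_PER_GB_MONTH = 0.023   # USD, rate quoted above
FREE_HOURS = 10                     # assumed to apply in the first month only
FREE_STORAGE_GB = 5                 # assumed to apply in the first month only


def estimate_monthly_cost(hours, storage_gb, first_month=False):
    """Rough monthly cost in USD for training/evaluation hours plus model storage."""
    free_hours = FREE_HOURS if first_month else 0
    free_gb = FREE_STORAGE_GB if first_month else 0

    billable_hours = max(0.0, hours - free_hours)
    billable_gb = max(0.0, storage_gb - free_gb)

    return (billable_hours * TRAINING_RATE_PER_HOUR
            + billable_gb * STORAGE_RATE_PER_GB_MONTH)


# 15 hours of training and 8 GB of models in the first month
print(f"${estimate_monthly_cost(15, 8, first_month=True):.2f}")
```

Under these assumptions, 15 hours and 8 GB in the first month come to 5 billable hours and 3 billable GB, or about $17.57. Training time dominates the bill; storage is a rounding error by comparison.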

Key Takeaways:

  • Choose your model's training time carefully. If you train for too long, the model overfits the track and cannot cope with slight changes in the environment. If you train for too little, the model will not be strong enough to make correct decisions.
  • While building and training the model in the virtual environment, focus on its accuracy and reliability rather than on the speed or lap time of your DeepRacer. In the physical racing league, you will be able to speed up your DeepRacer using their app on the phone.
