Use AWS StepFunctions for SSM Patching Alerts

Published at: 4/11/2024
Categories: aws, stepfunctions, ssm, cloudformation
Author: lp

In this blog post we'll explore how to use AWS Step Functions and SSM Patch Manager to monitor the patch compliance status of EC2 instances and send alerts, reducing manual tracking and enhancing the security of our cloud environment.

AWS Step Functions is a serverless orchestration service. It lets us connect Lambda functions and other AWS services to build business-critical applications.

The service is built around linked tasks assembled into a workflow called a "state machine". A task can invoke other AWS services or, more recently, third-party APIs.

AWS Systems Manager (SSM), which includes the Patch Manager feature, provides a unified interface for managing your AWS resources, including automated patching of EC2 instances.

However, some instances need a reboot before the patching is fully applied. We don't want to restart them automatically, especially when they run critical services like databases, so we prefer to choose when to restart.

To keep track of the EC2 instances that aren't fully compliant in the SSM patch report, we need alerts.

My goal was to send alerts to a Microsoft Teams channel, listing the EC2 instances that aren't compliant and need additional actions like rebooting. Initially, I used a Lambda function for this, but I didn't want to manage its dependencies over time, so I switched to Step Functions, taking advantage of its new support for calling HTTPS endpoints.

Overview of the State Machine

The entire process is initiated by an EventBridge rule, which acts as the trigger for the Step Functions state machine.

The state machine begins by identifying all currently active EC2 instances in the AWS account. It then retrieves all the instance IDs and filters them based on parameters such as ComplianceType=Patch and Status=NON_COMPLIANT.

Next it determines if there are any instances that need review. If not, the state machine will skip to the end and stop. To do this, we use a task that counts the number of instances in the list from the previous step. If the count is more than zero, indicating that there are instances requiring attention, the state machine continues to filter these instances by their tags. This information is then used to format a message sent to a Microsoft Teams channel, which includes the names and IDs of the EC2 instances that need our attention.

Finally, we call the third-party API to send the formatted message to the Microsoft Teams channel.
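The flow described above can be mimicked in plain Python to make the branching explicit. This is only a sketch of the logic, not the state machine itself: the function names and sample data are hypothetical, and the dictionaries are simplified stand-ins for the compliance and tag data the real steps return.

```python
from typing import Optional


def non_compliant_ids(summaries: list[dict]) -> list[str]:
    """Keep the IDs whose patch compliance status is NON_COMPLIANT."""
    return [
        s["ResourceId"]
        for s in summaries
        if s["ComplianceType"] == "Patch" and s["Status"] == "NON_COMPLIANT"
    ]


def build_message(instance_ids: list[str], tags: list[dict]) -> Optional[str]:
    """Return the alert text, or None when there is nothing to report."""
    # Mirrors the count-and-branch step: stop early on an empty list.
    if len(instance_ids) == 0:
        return None
    # Map each instance ID to the value of its "Name" tag, if it has one.
    names = {t["ResourceId"]: t["Value"] for t in tags if t["Key"] == "Name"}
    lines = [f"- {names.get(i, 'unnamed')} ({i})" for i in instance_ids]
    return "Instances that need attention:\n" + "\n".join(lines)


# Hypothetical sample data, for illustration only.
summaries = [
    {"ResourceId": "i-0aaa", "ComplianceType": "Patch", "Status": "NON_COMPLIANT"},
    {"ResourceId": "i-0bbb", "ComplianceType": "Patch", "Status": "COMPLIANT"},
]
tags = [
    {"ResourceId": "i-0aaa", "Key": "Name", "Value": "db-primary"},
    {"ResourceId": "i-0aaa", "Key": "Env", "Value": "prod"},
]

message = build_message(non_compliant_ids(summaries), tags)
print(message)
```

With this sample data only `i-0aaa` is reported; if every instance were compliant, `build_message` would return `None` and nothing would be sent.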

Full steps description

  1. DescribeInstances: starts the process by identifying all currently active EC2 instances within the AWS account.
  2. ExtractInstanceIDs: retrieves all the instance IDs from the previously fetched list of EC2 instances.
  3. FetchInstanceComplianceData: filters the instances based on the ComplianceType=Patch and Status=NON_COMPLIANT parameters.
  4. CalculateArrayLength: calculates the size of the list of non-compliant instances.
  5. CheckIfInstancesFound: checks whether the size of the list is greater than zero, meaning there are non-compliant instances. If none are found, the state machine skips to the end state and stops.
  6. DescribeTagsForFilteredInstances: if there are non-compliant instances, this step fetches the tags for these instances.
  7. PrepareNonCompliantInstanceList: prepares a list of non-compliant instances along with their names and IDs.
  8. CallThirdPartyAPI: formats the message with the non-compliant instances' information and sends it to a Microsoft Teams channel.
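The eight steps above could be wired together roughly as follows. This is a skeleton written as a Python dictionary in Amazon States Language shape, not the definition from the repo: the step names come from the list, while the resource ARNs, the JSON paths, and the `NoInstancesFound` end state are assumptions made for this sketch. The task parameters are left out, so it is not deployable as-is.

```python
import json

# Prefix for AWS SDK service integrations (the specific API calls are assumed).
SDK = "arn:aws:states:::aws-sdk:"


def task(resource: str, next_state: str = "") -> dict:
    """Build a Task state that moves to next_state, or ends the run if empty."""
    state = {"Type": "Task", "Resource": resource}
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


definition = {
    "StartAt": "DescribeInstances",
    "States": {
        "DescribeInstances": task(SDK + "ec2:describeInstances", "ExtractInstanceIDs"),
        "ExtractInstanceIDs": {
            "Type": "Pass",
            # Assumed path into the DescribeInstances result.
            "Parameters": {"InstanceIds.$": "$.Reservations[*].Instances[*].InstanceId"},
            "Next": "FetchInstanceComplianceData",
        },
        "FetchInstanceComplianceData": task(
            SDK + "ssm:listResourceComplianceSummaries", "CalculateArrayLength"
        ),
        "CalculateArrayLength": {
            "Type": "Pass",
            # States.ArrayLength is an intrinsic function; the path is assumed.
            "Parameters": {"Count.$": "States.ArrayLength($.ResourceComplianceSummaryItems)"},
            "Next": "CheckIfInstancesFound",
        },
        "CheckIfInstancesFound": {
            "Type": "Choice",
            "Choices": [
                {
                    "Variable": "$.Count",
                    "NumericGreaterThan": 0,
                    "Next": "DescribeTagsForFilteredInstances",
                }
            ],
            "Default": "NoInstancesFound",
        },
        "NoInstancesFound": {"Type": "Succeed"},  # hypothetical end state
        "DescribeTagsForFilteredInstances": task(
            SDK + "ec2:describeTags", "PrepareNonCompliantInstanceList"
        ),
        "PrepareNonCompliantInstanceList": {"Type": "Pass", "Next": "CallThirdPartyAPI"},
        "CallThirdPartyAPI": task("arn:aws:states:::http:invoke"),
    },
}


def missing_targets(defn: dict) -> set:
    """Return transition targets that are not defined as states."""
    states = defn["States"]
    targets = {defn["StartAt"]}
    for state in states.values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for choice in state.get("Choices", []):
            targets.add(choice["Next"])
    return targets - set(states)


print(missing_targets(definition))
print(json.dumps(definition, indent=2)[:60])
```

The small `missing_targets` check is a convenience for this sketch: it catches a mistyped state name before the definition is pasted into the console or a template.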

Additional Configuration Details

The same state machine can send the message to any communication tool, such as Slack or MS Teams, or to a ticketing system.

For MS Teams, the endpoint URL needs to be encoded: replace each "@" with "%40". Alternatively, you can use a URL shortener service.
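As a small illustration of that encoding, the Python standard library can do the replacement. The webhook URL below is a made-up placeholder, and `encode_webhook_url` is a hypothetical helper name.

```python
from urllib.parse import quote


def encode_webhook_url(url: str) -> str:
    """Percent-encode characters such as '@' while keeping ':' and '/' intact."""
    return quote(url, safe=":/")


# Hypothetical webhook URL, for illustration only.
raw = "https://example.webhook.office.com/webhookb2/1111@2222/IncomingWebhook/3333/4444"
encoded = encode_webhook_url(raw)
print(encoded)
```

If the URL carries a query string, characters such as "?", "&" and "=" would need to be added to `safe` so they are not encoded as well.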

An HTTP Task requires an EventBridge connection, which securely manages the authentication credentials of an API provider. A connection specifies the authorization type and credentials to use for authorizing a third-party API.

In our case we only send a payload to an external URL that needs no authentication, but the Step Functions HTTP Task still requires this connection. Creating the connection also creates an AWS Secrets Manager secret that is used for authentication. Since there's no need to authenticate to the MS Teams channel, the secret only holds placeholder values: the key is the API key name and the value is the ARN of the EventBridge API connection.
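To show where the connection fits, here is a rough sketch of what the HTTP Task state might look like, built as a Python dictionary. The connection ARN, the webhook URL, the `$.Message` path, and the simple `text` payload are all assumptions for illustration, not values from the actual deployment.

```python
import json

# Hypothetical values: replace with your own connection ARN and webhook URL.
CONNECTION_ARN = "arn:aws:events:eu-west-1:123456789012:connection/teams-webhook/example-id"
WEBHOOK_URL = "https://example.webhook.office.com/webhookb2/1111%402222/IncomingWebhook/3333/4444"


def http_task(message_path: str) -> dict:
    """Build an HTTP Task state that posts the text found at message_path."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::http:invoke",
        "Parameters": {
            "ApiEndpoint": WEBHOOK_URL,
            "Method": "POST",
            # The HTTP Task needs a connection even when the endpoint
            # itself does not check credentials.
            "Authentication": {"ConnectionArn": CONNECTION_ARN},
            # Assumed payload shape: a single "text" field read from the state input.
            "RequestBody": {"text.$": message_path},
        },
        "End": True,
    }


state = http_task("$.Message")
print(json.dumps(state, indent=2))
```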

Infrastructure as Code

The entire PoC was done using the AWS Console but since we are living in the age of automation, I wanted to have an easy and repeatable way of deploying the solution.

In recent weeks, the CloudFormation service team announced the new IaC generator (infrastructure as code generator), which must have been one of the most desired features for years, so I definitely wanted to give it a try.

It turns out that I was able to get the CloudFormation template for all the needed resources pretty easily. The hardest part was selecting all the resources involved in my scenario from a huge dropdown list and making sure I didn't leave any out. After the template was generated, replacing the hardcoded values inside the Step Functions JSON definition with parameters was a bit difficult. Now it seems like a piece of cake.

If you want to use this solution, check out the GitHub repo, which includes the entire CloudFormation stack needed for deployment.

During the initial stages of this PoC, I ran into difficulties with the CallThirdPartyAPI task, so I asked other AWS Community Builders in the dedicated Slack space for guidance and got almost instant help from Benoît Bouré, Jimmy Dahlqvist and Andres Moreno. Chapeau bas!
