Automating patching with AWS Systems Manager

Published at
11/13/2023
Categories
terraform
aws
ssm
powershell
Author
iskander

The code that accompanies this blogpost can be found here

Recently I've been looking into patching Windows servers that have dependencies between them, using AWS Systems Manager.

The use case was an application that consists of web servers, middleware servers and a database server.

Application diagram

The web servers have connections open to the database server, and the middleware servers run processes that get information from the database server.

The servers were patched manually: the services on the web servers and middleware servers were stopped first, and only after checking that all middleware services had stopped were the databases stopped. Once that was done, the servers were updated. After patching, the databases were brought back online first, followed by the middleware services and then the web services.

To automate this, I created some PowerShell scripts (with a little bit of SSM variable flavour) that run on the instances to stop and start the services, and to check the state of the services before continuing to the next step. These scripts are wrapped in SSM Command documents, which are then called from an Automation document.

Example script (Start-Components.ps1):

try {
  $_serverRole = "{{ServerRole}}" # This is an SSM variable reference
  # Query the computer system once and build the FQDN from host name and domain
  $_computerSystem = Get-CimInstance Win32_ComputerSystem
  $_fqdn = "$($_computerSystem.DNSHostName).$($_computerSystem.Domain)"
  Write-Output "[INF] Starting Components on $($_fqdn) with server role '$_serverRole'"

  switch ($_serverRole) {
    Web { 
      Write-Output "[INF] Setting Startup Type for web services where the current StartType is Manual to Automatic and starting them."

      Get-Service iisadmin | Where-Object StartType -eq "Manual" | Set-Service -StartupType Automatic -Status Running
      Get-Service w3svc | Where-Object StartType -eq "Manual" | Set-Service -StartupType Automatic -Status Running
    }
    Middleware {  
      Write-Output "[INF] Doing stuff to enable the middleware services to start."

      # Your code here
    }
    Database {  
      Write-Output "[INF] Setting Startup Type for all database services where the current StartType is Manual to Automatic and starting them."
      Get-Date -Format "yyyy-MM-dd HH:mm:ss"

      Get-Service *sql* | Where-Object StartType -eq "Manual" | Set-Service -StartupType Automatic -Status Running

      Write-Output "[INF] Making sure all database services are started before continuing."
      # When there are no services that match the name, the while loop will not be entered.
      while (Get-Service *sql* | Where-Object Status -ne Running) {
        Write-Output "[DEB] [$(Get-Date -Format "yyyy-MM-dd HH:mm:ss")] Not all database services have started yet. Waiting a little longer."
        Start-Sleep -Seconds 60
      }
    }
    Default { }
  }
}
catch {
  Write-Output "[ERR] Failed to start components!"
  Write-Error $Error[0] -ErrorAction Continue
  exit 1
}

The script is then read from disk and embedded in an SSM document using Terraform:

resource "aws_ssm_document" "patching_start_components" {
  name          = "Patching-StartComponents"
  document_type = "Command"
  target_type   = "/AWS::EC2::Instance"

  content = jsonencode({
    schemaVersion = "2.2"
    description   = "Patching Post-install Start Components Document"
    parameters = {
      ServerRole = {
        type        = "String"
        description = "Role of the server (Web, Middleware, Database, None)"
        default     = "None"
        allowedValues = [
          "Web",
          "Middleware",
          "Database",
          "None",
        ]
      }
    }
    mainSteps = [
      {
        action = "aws:runPowerShellScript"
        name   = "StartComponents"
        precondition = {
          StringEquals = [
            "platformType",
            "Windows"
          ]
        }
        inputs = {
          runCommand = split("\n", file("${path.cwd}/powershell_scripts/Start-Components.ps1"))
        }
      }
    ]
  })
}
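
The automation document further down also references aws_ssm_document.patching_stop_components, which is not shown in this post. A minimal sketch of what that counterpart could look like, assuming it mirrors the start document. The document name Patching-StopComponents, the description text and the script file name Stop-Components.ps1 are assumptions, not taken from the accompanying code:

```hcl
# Sketch only: mirrors the start document above.
# The name, description and script file name are assumptions.
resource "aws_ssm_document" "patching_stop_components" {
  name          = "Patching-StopComponents"
  document_type = "Command"
  target_type   = "/AWS::EC2::Instance"

  content = jsonencode({
    schemaVersion = "2.2"
    description   = "Patching Pre-install Stop Components Document"
    parameters = {
      ServerRole = {
        type          = "String"
        description   = "Role of the server (Web, Middleware, Database, None)"
        default       = "None"
        allowedValues = ["Web", "Middleware", "Database", "None"]
      }
    }
    mainSteps = [
      {
        action = "aws:runPowerShellScript"
        name   = "StopComponents"
        # Only run this step on Windows instances
        precondition = {
          StringEquals = ["platformType", "Windows"]
        }
        inputs = {
          # One array element per script line, as in the start document
          runCommand = split("\n", file("${path.cwd}/powershell_scripts/Stop-Components.ps1"))
        }
      }
    ]
  })
}
```

The stop script itself would presumably be the mirror image of Start-Components.ps1: stop the services for the given role and wait until they report as stopped.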

Using an automation document, we can orchestrate the flow of patching. In the example code, I've also included a way to patch servers with the same role at different times: the automation takes a PatchWindow parameter, with allowed values Monday and Wednesday. The output of each step is sent to an encrypted CloudWatch log group.

resource "aws_ssm_document" "patching_automation" {
  name            = "Patching-Automation"
  document_type   = "Automation"
  document_format = "YAML"

  content = <<EOT
description: |-
  # Patching Automation

  This script provides a staged patching experience. Services are stopped in a specific order on specific instances after which patching is run, and services are started again on servers in reverse order.
schemaVersion: '0.3'
parameters:
  PatchWindow:
    type: String
    allowedValues:
      - Monday
      - Wednesday
    description: Patch-window to run for. Determines which servers are affected.
mainSteps:
  - name: StopWebServerServices
    action: 'aws:runCommand'
    inputs:
      DocumentName: ${aws_ssm_document.patching_stop_components.name}
      Targets:
        - Key: 'tag:ServerRole'
          Values:
            - Web
        - Key: 'tag:PatchWindow'
          Values:
            - '{{PatchWindow}}'
      Parameters:
        ServerRole: Web
      CloudWatchOutputConfig:
        CloudWatchLogGroupName: ${aws_cloudwatch_log_group.automated_patching.name}
        CloudWatchOutputEnabled: true
    description: Stop the services on the web servers
    nextStep: StopMiddlewareServices
    onFailure: 'step:StartWebServerServices'
  - name: StopMiddlewareServices
    action: 'aws:runCommand'
    inputs:
      DocumentName: ${aws_ssm_document.patching_stop_components.name}
      Targets:
        - Key: 'tag:ServerRole'
          Values:
            - Middleware
        - Key: 'tag:PatchWindow'
          Values:
            - '{{PatchWindow}}'
      Parameters:
        ServerRole: Middleware
      CloudWatchOutputConfig:
        CloudWatchLogGroupName: ${aws_cloudwatch_log_group.automated_patching.name}
        CloudWatchOutputEnabled: true
    description: Stop the services on the middleware servers
    nextStep: StopDatabaseServices
    onFailure: 'step:StartMiddlewareServices'
  - name: StopDatabaseServices
    action: 'aws:runCommand'
    inputs:
      DocumentName: ${aws_ssm_document.patching_stop_components.name}
      Targets:
        - Key: 'tag:ServerRole'
          Values:
            - Database
        - Key: 'tag:PatchWindow'
          Values:
            - '{{PatchWindow}}'
      Parameters:
        ServerRole: Database
      CloudWatchOutputConfig:
        CloudWatchLogGroupName: ${aws_cloudwatch_log_group.automated_patching.name}
        CloudWatchOutputEnabled: true
    description: Stop the services on the database servers
    nextStep: PatchServers
    onFailure: 'step:StartDatabaseServices'
  - name: PatchServers
    action: 'aws:runCommand'
    inputs:
      DocumentName: AWS-RunPatchBaseline
      Targets:
        # Uncomment the following lines to only patch specific server-roles
        # - Key: 'tag:ServerRole'
        #   Values:
        #     - Web
        #     - Middleware
        #     - Database
        - Key: 'tag:PatchWindow'
          Values:
            - '{{PatchWindow}}'
      Parameters:
        Operation: Install
        RebootOption: RebootIfNeeded
      CloudWatchOutputConfig:
        CloudWatchLogGroupName: ${aws_cloudwatch_log_group.automated_patching.name}
        CloudWatchOutputEnabled: true
    description: Patch the servers
    nextStep: StartDatabaseServices
    onFailure: Abort
  - name: StartDatabaseServices
    action: 'aws:runCommand'
    inputs:
      DocumentName: ${aws_ssm_document.patching_start_components.name}
      Targets:
        - Key: 'tag:ServerRole'
          Values:
            - Database
        - Key: 'tag:PatchWindow'
          Values:
            - '{{PatchWindow}}'
      Parameters:
        ServerRole: Database
      CloudWatchOutputConfig:
        CloudWatchLogGroupName: ${aws_cloudwatch_log_group.automated_patching.name}
        CloudWatchOutputEnabled: true
    description: Start the services on the database servers
    nextStep: StartMiddlewareServices
    onFailure: Abort
  - name: StartMiddlewareServices
    action: 'aws:runCommand'
    inputs:
      DocumentName: ${aws_ssm_document.patching_start_components.name}
      Targets:
        - Key: 'tag:ServerRole'
          Values:
            - Middleware
        - Key: 'tag:PatchWindow'
          Values:
            - '{{PatchWindow}}'
      Parameters:
        ServerRole: Middleware
      CloudWatchOutputConfig:
        CloudWatchLogGroupName: ${aws_cloudwatch_log_group.automated_patching.name}
        CloudWatchOutputEnabled: true
    description: Start the services on the middleware servers
    nextStep: StartWebServerServices
    onFailure: Abort
  - name: StartWebServerServices
    action: 'aws:runCommand'
    inputs:
      DocumentName: ${aws_ssm_document.patching_start_components.name}
      Targets:
        - Key: 'tag:ServerRole'
          Values:
            - Web
        - Key: 'tag:PatchWindow'
          Values:
            - '{{PatchWindow}}'
      Parameters:
        ServerRole: Web
      CloudWatchOutputConfig:
        CloudWatchLogGroupName: ${aws_cloudwatch_log_group.automated_patching.name}
        CloudWatchOutputEnabled: true
    description: Start the services on the web servers
    isEnd: true
EOT
}
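
Each step above writes its output to aws_cloudwatch_log_group.automated_patching, which is defined elsewhere in the accompanying code. As a rough sketch of such an encrypted log group, assuming an existing KMS key is passed in through a hypothetical variable log_kms_key_arn (the log group name and retention period are also made up for illustration):

```hcl
# Hypothetical input: ARN of an existing KMS key. Its key policy must allow
# the CloudWatch Logs service in your region to use the key.
variable "log_kms_key_arn" {
  type        = string
  description = "ARN of the KMS key used to encrypt the patching log group"
}

resource "aws_cloudwatch_log_group" "automated_patching" {
  name              = "/ssm/automated-patching" # assumed name
  retention_in_days = 30                        # assumed retention
  kms_key_id        = var.log_kms_key_arn       # enables encryption at rest
}
```

Keeping all steps in one log group makes it easier to follow a single patch run from the first stop step to the last start step.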

The automation document allows for some error handling as well. As you can see in the example, when the step StopMiddlewareServices fails, the automation jumps to the step StartMiddlewareServices (defined with the line onFailure: 'step:StartMiddlewareServices') and proceeds from there, so the middleware services and then the web services are started again without any patching taking place.

Once we have the automation document in place, we can create maintenance windows, each with an associated task that automatically executes the automation document on a schedule.

resource "aws_ssm_maintenance_window" "install_window_monday" {
  enabled  = true
  name     = "patch-window-monday"
  schedule = local.patching.cron_patching_monday
  duration = 4
  cutoff   = 2
}

resource "aws_ssm_maintenance_window_task" "task_install_patches_monday" {
  window_id = aws_ssm_maintenance_window.install_window_monday.id
  name      = "install-patches-monday"
  task_type = "AUTOMATION"
  task_arn  = aws_ssm_document.patching_automation.name
  priority  = 5

  task_invocation_parameters {
    automation_parameters {
      document_version = "$LATEST"

      parameter {
        name   = "PatchWindow"
        values = ["Monday"]
      }
    }
  }
}
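
The schedule above comes from local.patching.cron_patching_monday, and the Wednesday window is not shown. A possible sketch of both, assuming the windows open at 02:00 UTC (the cron values and the Wednesday resource names are illustrative, not taken from the accompanying code):

```hcl
locals {
  patching = {
    # Assumed schedules: 02:00 UTC every Monday and every Wednesday
    cron_patching_monday    = "cron(0 2 ? * MON *)"
    cron_patching_wednesday = "cron(0 2 ? * WED *)"
  }
}

resource "aws_ssm_maintenance_window" "install_window_wednesday" {
  enabled  = true
  name     = "patch-window-wednesday"
  schedule = local.patching.cron_patching_wednesday
  duration = 4 # hours the window stays open
  cutoff   = 2 # hours before the end at which no new tasks are started
}

resource "aws_ssm_maintenance_window_task" "task_install_patches_wednesday" {
  window_id = aws_ssm_maintenance_window.install_window_wednesday.id
  name      = "install-patches-wednesday"
  task_type = "AUTOMATION"
  task_arn  = aws_ssm_document.patching_automation.name
  priority  = 5

  task_invocation_parameters {
    automation_parameters {
      document_version = "$LATEST"

      # Selects the instances tagged PatchWindow = Wednesday
      parameter {
        name   = "PatchWindow"
        values = ["Wednesday"]
      }
    }
  }
}
```

With more than two windows, the same pair of resources could be generated with for_each over a map of window names to schedules instead of being copied per day.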

In this example, instances with a tag PatchWindow with a value of Monday will be targeted for the maintenance task.

After applying the code to your environment, instances can be included by setting two tags on them.
PatchWindow determines the maintenance window the instance will be included in. In this example, valid values are Monday and Wednesday.
ServerRole determines which actions in the PowerShell scripts will be taken. In this example, valid values are Web, Middleware, Database or None.
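
If the instances already exist and are not managed in the same Terraform configuration, the two tags could be set with aws_ec2_tag. This is only a sketch: the variable web_instance_id is hypothetical, and for instances that are defined as aws_instance resources you would normally set the tags on that resource instead, since managing the same tag in both places causes conflicts.

```hcl
# Hypothetical input: ID of an existing web server instance
variable "web_instance_id" {
  type        = string
  description = "ID of the EC2 instance to include in the Monday patch window"
}

# One aws_ec2_tag resource per tag key
resource "aws_ec2_tag" "web_patching" {
  for_each = {
    PatchWindow = "Monday" # maintenance window the instance belongs to
    ServerRole  = "Web"    # role the automation targets in its stop and start steps
  }

  resource_id = var.web_instance_id
  key         = each.key
  value       = each.value
}
```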
