
Swapping out microservices gracefully with the help of AWS

Published at: 10/7/2024
Categories: kubernetes, cloudskills, aws, softwareengineering
Author: Derek Berger

Introduction

The AWS Load Balancer Controller is a key enabler for running services in Amazon EKS: it uses AWS APIs to provision load balancer resources for the cluster. But the controller can help with more than everyday management of load balancers. For instance, it greatly simplified how my team released APIs during a major project to rewrite our services in a new language.

Background

Our application follows a common pattern for running microservices in EKS. Outside requests come into our clusters through application load balancers (ALBs). The ALBs’ listener rules forward requests to target groups according to path-based rules that correspond to services’ endpoints.

Figure: ALBs fronting EKS services

The load balancer controller manages our ALBs based on Ingress resources defined in services’ Helm manifests. We keep these manifests in our version control system, and deploy them through pull requests.

Here’s an abbreviated example from one of our services:

ingress:
  ingressClassName: alb
  enabled: true
  annotations:
    # Ingresses that share a group.name are served by the same ALB
    alb.ingress.kubernetes.io/group.name: login
    alb.ingress.kubernetes.io/subnets: 'subnet-a,subnet-b,subnet-c'
    # Health check settings for the target group
    alb.ingress.kubernetes.io/healthcheck-path: '/help-i-am-alive'
    alb.ingress.kubernetes.io/success-codes: '200,404'
    # Register pod IPs as targets and talk to them over HTTPS
    alb.ingress.kubernetes.io/target-type: 'ip'
    alb.ingress.kubernetes.io/backend-protocol: 'HTTPS'

When an Ingress is deployed, the controller provisions the ALB, applies the path-based rules, and creates the target group that points to the service’s pods. It handles additional behaviors for certificates, ELB access logs, health checks and more through annotations. If we deploy updates to an ingress, the controller keeps the ALB in sync with its definition.
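To make the mapping concrete, here is a rough sketch of the kind of Ingress resource a chart might render from values like these. The resource name, backend Service name, port, and path are placeholders for illustration, not values taken from our charts, and a real template may render more fields.

```yaml
# Sketch of a rendered Ingress; names, port, and path are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: login-service                  # hypothetical resource name
  annotations:
    alb.ingress.kubernetes.io/group.name: login
    alb.ingress.kubernetes.io/target-type: 'ip'
    alb.ingress.kubernetes.io/healthcheck-path: '/help-i-am-alive'
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /api/login/          # becomes a path-based rule on the ALB
            pathType: Prefix
            backend:
              service:
                name: login-service    # hypothetical Service name
                port:
                  number: 443          # hypothetical port
```

The annotations carry the ALB-specific settings, while the spec's paths become the path-based rules that the controller applies to the load balancer.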

In with the new (but not quite out with the old)

During the project to rewrite our microservices, we continued to define service and ingress resources in Helm manifests. The new challenge was running old and new services side by side while we incrementally rewrote and released individual APIs. We wanted requests for rewritten APIs to be forwarded to the new service, while requests for all other APIs continued to go to its older counterpart.

The Ingress Group feature made this possible in part by consolidating old and new Ingress resources under the same ALB with the original group.name annotation. When the team released an API, we just added a pathType: Exact rule for that endpoint and deployed its ingress.

Here is an excerpt from a new service’s ingress, with some pathType: Exact path-based rules:

ingress:
  ingressClassName: alb
  enabled: true
  annotations:
    alb.ingress.kubernetes.io/group.name: login
  paths:
    - path: '/api/login/path1'
      pathType: Exact
    - path: '/api/login/path2'
      pathType: Exact

Here is the original service’s ingress, which has a single pathType: Prefix rule that catches anything not matched by the new service’s path-based rules:

ingress:
  ingressClassName: alb
  enabled: true
  annotations:
    alb.ingress.kubernetes.io/group.name: login
  paths:
    - path: '/api/login/'
      pathType: Prefix

Because we defined both ingresses with alb.ingress.kubernetes.io/group.name: login, the controller would apply both sets of rules to the original ALB, letting the new service, or so we hoped, take over requests from the original service.

Not so fast

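The problem was that the pathType: Prefix rule matches every request under /api/login/, including /api/login/path1 and /api/login/path2. An ALB evaluates its listener rules in priority order and forwards a request according to the first rule that matches. Unless the Exact rules were evaluated before the Prefix rule, we had no guarantee that requests for those paths would be forwarded to the new service.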

To solve this, we could have replaced the Prefix path with Exact paths for all the APIs we still wanted forwarded to the old service. That would have spared us from creating a new ALB, but it would have added complexity and friction to our releases, requiring changes to two ingresses with every release.
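For illustration, that rejected alternative would have looked roughly like this in the original service's values, with one Exact rule per endpoint not yet rewritten. The paths path3 and path4 are hypothetical placeholders, not real endpoints.

```yaml
# Original service, rejected alternative: enumerate every remaining endpoint.
ingress:
  ingressClassName: alb
  enabled: true
  annotations:
    alb.ingress.kubernetes.io/group.name: login
  paths:
    - path: '/api/login/path3'   # hypothetical endpoint still served by the old service
      pathType: Exact
    - path: '/api/login/path4'   # hypothetical; would be deleted here once rewritten
      pathType: Exact
```

Every release would then mean adding a path to the new service's ingress and removing the same path from this one.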

Help from AWS

We found a more elegant solution in a subtle but powerful controller annotation called group.order. Ingresses in a group are ordered by ascending group.order, so by assigning a smaller order number to the new service's Ingress, we ensured that its path rules would be evaluated before the original service's Prefix rule.

Here's the new service's Ingress again, now with alb.ingress.kubernetes.io/group.order:

ingress:
  ingressClassName: alb
  enabled: true
  annotations:
    alb.ingress.kubernetes.io/group.name: login
    # Annotation values must be strings, so the number is quoted
    alb.ingress.kubernetes.io/group.order: '10'
  paths:
    - path: '/api/login/path1'
      pathType: Exact
    - path: '/api/login/path2'
      pathType: Exact

With that, we could set a higher group.order value for the original Ingress and leave it alone until all endpoints were transitioned. Then we just replaced all the pathType: Exact rules in the new service’s manifest with a pathType: Prefix rule and deleted the old service. The same approach worked for all of our services with Ingress resources.
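As a sketch, the original service's ingress with a higher order might look like the following. The value 20 is an assumed example: any value larger than the new service's 10 would place the Prefix rule after the Exact rules.

```yaml
# Original service: its rule is evaluated after the new service's Exact rules.
ingress:
  ingressClassName: alb
  enabled: true
  annotations:
    alb.ingress.kubernetes.io/group.name: login
    alb.ingress.kubernetes.io/group.order: '20'   # assumed value, only needs to exceed 10
  paths:
    - path: '/api/login/'
      pathType: Prefix
```

Once every endpoint had moved, the new service's paths list would collapse to the same single '/api/login/' Prefix entry, and this ingress could be removed along with the old service.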

Conclusion

The AWS Load Balancer Controller's group.order feature has made it trivial for my team to release new APIs. The experience reminds me that maintaining infrastructure as code provides benefits beyond everyday management of infrastructure. Features like group.order allow engineers to spend more time on features and less time managing the infrastructure that they run on.
