From Bedroom Disasters to Cloud Resilience: Explaining AWS DR Strategies To Anyone

Published at
12/10/2023
Categories
aws
resilience
cloud
beginners
Author
harinderseera

Disclaimer: While this article simplifies DR strategies, it's important to remember that DR is a complex topic with many nuances.

In recent years, I've come to realise that concepts many of us in information technology take for granted aren't always common knowledge among everyone working in the technology industry. It highlights an ongoing need in the tech industry to explain complex ideas clearly to broad audiences with little to no exposure to the ever-evolving technology world.

I have recently started writing articles that explain complex ideas in a way anyone can understand. In this article, I explain Amazon Web Services (AWS) disaster recovery (DR) strategies through a household analogy. By relating cloud computing concepts to familiar scenarios, I hope to bridge knowledge gaps and demystify cloud technology for the non-techie person.

To begin with, below are the different DR strategies that can be implemented on AWS:

  • Backup and Restore

  • Pilot Light

  • Warm Standby

  • Multi-site Active/Active

  • Multi AZ deployment

Image description
Ref

Now we will go through the house analogy to explain them individually. We will start with the Multi AZ deployment strategy.

Single AWS Region - Multi AZ Deployment Strategy

If you are wondering what a region is, I have written an article called "AWS Is A Zoo: Anyone Can Navigate the Cloud Jungle!" It explains what a region is using the Zoo analogy.

Image description

Back to our DR strategy. Imagine you have a house with three bedrooms. You and your partner occupy one room, while your two kids each have their own room. Unfortunately, while you are all out for dinner, the roof of one child's bedroom collapses (as shown in the image above). Luckily, because your house has three bedrooms, your children can share a room temporarily while the other room is repaired; all it takes is a small adjustment such as moving a bed or using an air mattress. You won't have to change much to solve the problem of where your kid will sleep.

This scenario resembles a single AWS Region with multiple Availability Zones (AZs). Each AZ is made up of one or more physical data centres. If a disaster, such as a natural calamity or a technical failure, takes out one AZ, having your workload distributed across several AZs within the same Region helps you weather the storm: your workload can continue operating in the remaining AZs.
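For readers who are comfortable with a little code, here is a minimal sketch of the same idea. It is a toy model, not an AWS API: the `Zone` class, the `route` function and the zone names are all hypothetical, made up for illustration. It only shows that requests keep being served by the remaining zones once one zone is marked unhealthy.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """One Availability Zone; in the analogy, one bedroom."""
    name: str
    healthy: bool = True


def route(request_id: int, zones: list) -> str:
    """Send a request to a healthy zone, rotating between them."""
    healthy = [zone for zone in zones if zone.healthy]
    if not healthy:
        raise RuntimeError("no healthy zone left in this region")
    return healthy[request_id % len(healthy)].name


# Hypothetical zone names, chosen for illustration only.
zones = [Zone("zone-a"), Zone("zone-b"), Zone("zone-c")]
before = [route(i, zones) for i in range(6)]

zones[1].healthy = False  # the roof of one bedroom collapses
after = [route(i, zones) for i in range(6)]

print(before)
print(after)
```

Every request in `after` still gets an answer; it simply comes from one of the two zones that are left, just as the kids still have a bed to sleep in.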

Multi AWS Region Strategy

Let's imagine you have two homes: a primary residence and a vacation home. These will be used to explain the remaining AWS disaster recovery strategies.

Backup and Restore Strategy

Image description

In your primary house, you have a fireproof vault where you regularly store copies of important documents such as your birth certificate, driving licence and passport. In the event of a fire, you can retrieve these documents to identify yourself or use them for other purposes, such as home insurance claims.

Similarly, on AWS, the backup and restore DR strategy involves regularly backing up your data. In the event of a disaster, you can use these backups to restore lost data and applications. This strategy can be applied to both single-region and multi-region implementations. For a single region, you can copy the data to a different account, while for multi-region, you can copy the data into your secondary region and retrieve it when needed.
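As a rough sketch of backup and restore, the toy example below uses plain Python dictionaries. Nothing here is a real AWS call: `backup_store`, `take_backup` and `restore` are hypothetical names, and the dictionary simply stands in for backup storage kept in another account or region.

```python
import copy

# The "primary house": live data the application works with.
primary = {"orders": ["order-1", "order-2"]}

# The "fireproof vault": a stand-in for backups kept elsewhere.
backup_store = {}


def take_backup(label: str, data: dict) -> None:
    """Store an independent copy of the data under a label."""
    backup_store[label] = copy.deepcopy(data)


def restore(label: str) -> dict:
    """Return a fresh copy of a previously stored backup."""
    return copy.deepcopy(backup_store[label])


take_backup("nightly", primary)
primary["orders"].append("order-3")  # written after the last backup

primary = {}                   # the disaster: live data is gone
primary = restore("nightly")   # recover from the vault

print(primary)
```

Notice that `order-3` does not come back, because it was written after the last backup was taken. How often you back up decides how much recent data you can afford to lose.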

Pilot Light Strategy

Image description

Imagine a vacation home that you plan to use only occasionally. It's not fully furnished or equipped for daily living; it has a minimal/basic structure in place, just like in the picture above. You could make it habitable with some effort, but it's not ready for immediate occupancy.

This analogy closely resembles the pilot light disaster recovery (DR) strategy on AWS. The infrastructure in the recovery region is minimalistic, similar to the basic structure of a vacation home. It provides the core components needed to run your applications but lacks the fully configured environment required for production workloads. When a disaster occurs, you must provision additional resources, configure applications, and perform other manual tasks before the recovery region can fully handle production traffic.
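The failover steps for pilot light can be sketched in a few lines. This is a simplified model under my own assumptions, not an AWS API: I assume the core component kept in place is a replicated copy of the data, and `RecoveryRegion` and `fail_over` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RecoveryRegion:
    data_replicated: bool = True   # assumed core piece that is always in place
    app_servers: int = 0           # nothing is running yet, like the bare house
    serving_traffic: bool = False


def fail_over(region: RecoveryRegion, servers_needed: int) -> list:
    """Return the steps taken to make the recovery region production-ready."""
    if not region.data_replicated:
        raise RuntimeError("no replicated data to recover from")
    steps = []
    missing = servers_needed - region.app_servers
    if missing > 0:
        # The extra work that has to happen before anyone can "move in".
        steps.append(f"provision and configure {missing} application servers")
        region.app_servers = servers_needed
    steps.append("send production traffic to the recovery region")
    region.serving_traffic = True
    return steps


region = RecoveryRegion()
steps = fail_over(region, servers_needed=4)

for step in steps:
    print(step)
```

The point to take away is the order of the steps: with pilot light, the provisioning work comes first and traffic can only be redirected afterwards.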

Warm Standby Strategy

Image description

With the warm standby approach, your vacation home is fully built and has the essentials in place, but it is not as fully furnished and stocked as your primary residence. It is ready for immediate use, though it may not have all the amenities and personal touches of your primary residence.

Similarly, the warm standby DR strategy on AWS involves maintaining a scaled-down, but fully functional, copy of your production environment in the recovery region. This replica is kept up to date with data changes, but it may not be running at full capacity or serving all application components. When a disaster occurs, you can quickly fail over to the recovery region, minimising downtime.
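Here is a small sketch of what "scaled-down but fully functional" means at the moment of failover. The function name `fail_over_warm` and the server counts are hypothetical; the model simply assumes capacity is proportional to the number of running servers.

```python
def fail_over_warm(standby_servers: int, production_servers: int) -> dict:
    """Describe a failover to a scaled-down copy that is already running."""
    if standby_servers < 1:
        raise ValueError("warm standby needs at least one running server")
    # Share of normal capacity available the moment traffic is switched.
    capacity = min(1.0, standby_servers / production_servers)
    # Traffic can move first because the copy is already functional.
    steps = ["send production traffic to the recovery region"]
    if standby_servers < production_servers:
        extra = production_servers - standby_servers
        steps.append(f"scale out by {extra} servers to reach full capacity")
    return {"capacity_at_cutover": capacity, "steps": steps}


result = fail_over_warm(standby_servers=1, production_servers=4)

print(result["capacity_at_cutover"])
for step in result["steps"]:
    print(step)
```

In this sketch traffic is switched straight away and the scaling up happens afterwards, which is why downtime is shorter than when the environment has to be built first.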

Multi-Site Active/Active Strategy

Image description

With the active/active approach, both your primary residence and vacation home are fully equipped and ready for daily use. You can comfortably switch between them depending on your needs or preferences. You can have your family living in your vacation home while you reside in your primary home or vice versa.

The multi-site active/active DR strategy on AWS mirrors this concept. You have two production environments running simultaneously in different regions, each serving a portion of your application traffic. This strategy offers the highest availability and lowest recovery time objective but comes with the highest cost and complexity.
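To illustrate "each serving a portion of your application traffic", here is a minimal sketch of weighted traffic sharing between two regions. The region names and the `traffic_share` function are hypothetical and this is not how any particular AWS service is configured; it only shows the arithmetic behind the idea.

```python
def traffic_share(weights: dict) -> dict:
    """Turn per-region weights into the fraction of traffic each one gets."""
    total = sum(weights.values())
    if total <= 0:
        raise RuntimeError("no region is available to serve traffic")
    return {region: weight / total for region, weight in weights.items()}


# Hypothetical region names borrowed from the house analogy.
weights = {"primary-home": 50, "vacation-home": 50}
normal = traffic_share(weights)

weights["primary-home"] = 0  # disaster: stop sending traffic here
during_disaster = traffic_share(weights)

print(normal)
print(during_disaster)
```

Because both regions are already running at production scale, nothing has to be built or scaled up when one of them fails; the surviving region simply takes all of the traffic.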

Closing Thoughts

In this article, I aimed to explain key AWS disaster recovery concepts in an easy-to-understand manner. By comparing cloud computing strategies to everyday household scenarios, my goal was to make complex technical ideas simple and accessible. Concepts such as Availability Zones and recovery time objectives may be familiar to those with a technology background, but they can be challenging for those without one.

I hope that these simplified explanations shed light on AWS disaster recovery in a way that resonates with everyone. If this piece has helped demystify even one core concept for you, then it has achieved its purpose.


Thanks for reading!

If you enjoyed this article, feel free to share it on social media 🙂

Say Hello on: Linkedin | Twitter

Github repo: hseera
