dev-resources.site

Simian Army - Netflix - Disaster Recovery - AWS

Published at: 1/19/2023
Categories: cloud, aws, netflix, disasterrecovery
Author: yogini16

Simian Army is a set of tools developed by Netflix for simulating failure scenarios in cloud infrastructure. It includes several different tools, such as Chaos Monkey and Latency Monkey, which are designed to randomly terminate instances or introduce latency in order to test the resilience of a system. The goal of Simian Army is to help engineers build systems that can withstand failures and remain available even in the face of unexpected events.
It's available for AWS and other cloud providers.

The source code is written in Java and is available in the project's Git repository.

On AWS, Simian Army is a set of open-source tools that help identify potential availability risks and improve the resilience of your infrastructure. The tools are designed to simulate different types of failures that can occur in a production environment, such as instance termination and network latency. By running these simulations, you can identify and fix weaknesses in your infrastructure before they cause a real outage. Simian Army includes the following tools:

Chaos Monkey randomly terminates instances in an Auto Scaling group to test the system's ability to tolerate instance failures.

Latency Monkey induces artificial delays in our RESTful client-server communication layer to simulate service degradation and measures if upstream services respond appropriately. In addition, by making very large delays, we can simulate a node or even an entire service downtime (and test our ability to survive it) without physically bringing these instances down. This can be particularly useful when testing the fault-tolerance of a new service by simulating the failure of its dependencies, without making these dependencies unavailable to the rest of the system.

Conformity Monkey finds instances that don’t adhere to best-practices and shuts them down. For example, we know that if we find instances that don’t belong to an auto-scaling group, that’s trouble waiting to happen. We shut them down to give the service owner the opportunity to re-launch them properly.

Doctor Monkey taps into health checks that run on each instance and also monitors other external signs of health (e.g. CPU load) to detect unhealthy instances. Once unhealthy instances are detected, they are removed from service and, after the service owners have had time to root-cause the problem, are eventually terminated.

Janitor Monkey ensures that our cloud environment is running free of clutter and waste. It searches for unused resources and disposes of them.

Security Monkey is an extension of Conformity Monkey. It finds security violations or vulnerabilities, such as improperly configured AWS security groups, and terminates the offending instances. It also ensures that all our SSL and DRM certificates are valid and are not coming up for renewal.

10–18 Monkey (short for Localization-Internationalization, or l10n-i18n) detects configuration and runtime problems in instances serving customers in multiple geographic regions, using different languages and character sets.

Chaos Gorilla is similar to Chaos Monkey, but simulates an outage of an entire Amazon availability zone. It verifies that our services automatically re-balance to the functional availability zones without user-visible impact or manual intervention.
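
To make the difference between Chaos Monkey and Chaos Gorilla concrete, here is a minimal in-memory sketch in Python. It is not the Simian Army code (which is written in Java) and it makes no AWS calls. The `Instance` class, the group and zone names, and the helper functions are all hypothetical, chosen only to illustrate the two failure modes described above.

```python
import random
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    group: str          # hypothetical Auto Scaling group name
    zone: str           # hypothetical availability zone name
    running: bool = True


def chaos_monkey(instances, rng):
    """Terminate one randomly chosen running instance in each group."""
    by_group = {}
    for inst in instances:
        if inst.running:
            by_group.setdefault(inst.group, []).append(inst)
    victims = []
    for group in sorted(by_group):
        victim = rng.choice(by_group[group])
        victim.running = False
        victims.append(victim.instance_id)
    return victims


def chaos_gorilla(instances, zone):
    """Terminate every running instance in a single availability zone."""
    victims = [inst for inst in instances if inst.running and inst.zone == zone]
    for inst in victims:
        inst.running = False
    return [inst.instance_id for inst in victims]


def surviving_zones(instances, group):
    """Zones in which the given group still has at least one running instance."""
    return {inst.zone for inst in instances if inst.running and inst.group == group}


def build_fleet():
    """Two groups, three zones, two instances per group per zone (assumed layout)."""
    zones = ["zone-a", "zone-b", "zone-c"]
    return [
        Instance(f"{group}-{zone}-{n}", group, zone)
        for group in ("api", "web")
        for zone in zones
        for n in range(2)
    ]


if __name__ == "__main__":
    fleet = build_fleet()
    print("Chaos Monkey terminated:", chaos_monkey(fleet, random.Random()))
    print("Chaos Gorilla terminated:", chaos_gorilla(fleet, "zone-a"))
    print("api still serving from:", sorted(surviving_zones(fleet, "api")))
```

In this toy layout each group keeps capacity in the remaining zones after a zone outage, which is the property Chaos Gorilla is meant to verify. A real run would terminate actual instances and watch whether traffic re-balances without manual intervention.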

By running these tools, you can gain insight into the resilience and availability of your infrastructure and take steps to improve it.
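
The latency-injection idea behind Latency Monkey can also be tried at small scale. The sketch below is an assumption-laden illustration in Python, not Netflix's implementation. `with_injected_latency`, `call_with_budget`, and `FakeClock` are hypothetical names, and the fake clock stands in for real sleeping so the example runs instantly. The budget check runs only after the call returns, so it does not interrupt a slow call the way a real timeout would.

```python
import random
import time


class FakeClock:
    """Hypothetical clock that advances only when sleep() is called."""

    def __init__(self):
        self.now = 0.0

    def sleep(self, seconds):
        self.now += seconds

    def time(self):
        return self.now


def with_injected_latency(call, delay_seconds, probability, rng, sleep=time.sleep):
    """Wrap `call` so that a fraction of invocations are delayed before running."""
    def wrapped(*args, **kwargs):
        if rng.random() < probability:
            sleep(delay_seconds)
        return call(*args, **kwargs)
    return wrapped


def call_with_budget(call, budget_seconds, fallback, clock):
    """Return the call's result if it finished within budget, else the fallback.

    The elapsed time is checked after the call completes, so this only
    models the decision to fall back; it does not cancel the slow call.
    """
    start = clock.time()
    result = call()
    elapsed = clock.time() - start
    return result if elapsed <= budget_seconds else fallback


if __name__ == "__main__":
    clock = FakeClock()

    def dependency():
        return "fresh"

    # Delay every call by 5 simulated seconds, against a 1 second budget.
    degraded = with_injected_latency(dependency, 5.0, 1.0, random.Random(), sleep=clock.sleep)
    print(call_with_budget(degraded, 1.0, "cached", clock))
```

Setting the delay very high mimics a dependency that is effectively down while leaving the real service untouched, which is how the text above describes testing fault tolerance without bringing instances down.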
