Database Scaling NLogN 📈

Published at
12/29/2024
Categories
systemdesign
database
webdev
computerscience
Author
ycwhencpp

What is Database Scaling?

Database scaling refers to the process of increasing a database's capacity to handle growing workloads, such as higher numbers of queries, larger datasets, or more concurrent users.
The primary goal is to ensure the database maintains performance, reliability, and availability as demand increases.

There are two primary types of database scaling:

  • ⬆️ Vertical Scaling (Scaling Up)
  • ➡️ Horizontal Scaling (Sharding)

[Figure: vertical scaling vs. horizontal scaling]

🤔 Why Do We Need Database Scaling?

Suppose your current database server can handle up to 1,000 queries per second (QPS). What happens if traffic climbs past that after a new feature launch or a jump in popularity? Without scaling, requests queue up, latency rises, and the database eventually fails under the load.
Scaling lets your database keep up with the growing demands of your application while balancing data distribution, making good use of resources, and keeping costs under control.

❓Why Not Just Use Master-Slave Replication?

If you're familiar with master-slave replication, you might think it eliminates the need for scaling. (If not, check out my blog post on Database Replication NLogN👈 to learn more). While replication is effective for scaling reads and providing fault tolerance, it doesn't solve all challenges:

  1. Write Bottlenecks: Replication doesn't improve write capacity, since all writes must still go to the master.
  2. Hotspots: Frequently accessed data can still overload specific nodes.
  3. Scaling Limits: Replication alone doesn't solve global traffic distribution. Every node still holds the full dataset, and writes from any region still travel to the single master.
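
To make the write bottleneck concrete, here is a minimal Python sketch of a replication-aware query router. The `ReplicatedRouter` class, the node names, and the "anything that is not a SELECT is a write" rule are all assumptions made up for illustration; real deployments usually handle this in a driver or proxy.

```python
from collections import Counter
from itertools import cycle


class ReplicatedRouter:
    """Toy router: writes go to the master, reads rotate across replicas."""

    def __init__(self, master: str, replicas: list[str]):
        self.master = master
        self._replicas = cycle(replicas)  # simple round-robin over replicas
        self.load = Counter()             # queries handled per node

    def route(self, query: str) -> str:
        # Assumption: anything that is not a SELECT counts as a write.
        is_read = query.lstrip().upper().startswith("SELECT")
        node = next(self._replicas) if is_read else self.master
        self.load[node] += 1
        return node


router = ReplicatedRouter("master", ["replica-1", "replica-2", "replica-3"])

# 300 reads and 300 writes (hypothetical traffic mix).
for _ in range(300):
    router.route("SELECT * FROM users WHERE id = 1")
for _ in range(300):
    router.route("UPDATE users SET name = 'x' WHERE id = 1")

print(dict(router.load))
```

In this run the reads split into 100 per replica, while all 300 writes land on the master. Adding more replicas would thin out the reads further but would not move a single write.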

🛠️ Types of Database Scaling

1.⬆️ Vertical Scaling (Scaling Up)

Vertical scaling involves upgrading the existing database server by adding more hardware, such as additional CPU, RAM, or storage.

Why Can't We Keep Adding More Hardware?

  1. High Costs: At the high end, server prices rise much faster than the capacity they add.
  2. Hardware Limitations:
    • Limited motherboard slots for RAM
    • A cap on the cores and threads a CPU can offer
  3. Single Point of Failure: If the one server goes down, the whole database goes down with it.
  4. Scalability Ceiling: Hardware improvements have physical limits, so a single machine can only grow so far.

2.➡️ Horizontal Scaling (Sharding)

Horizontal scaling involves distributing data across multiple servers to divide the load. Each server (or shard) stores a unique subset of the data.

How Does Sharding Work?

Data is allocated to servers based on a sharding key. For example:

  • Use a hash function like user_id % 4 to determine which shard handles specific user data
  • Shard 0 stores data for users where the result is 0, and so on
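
The routing rule above can be sketched in a few lines of Python. This is only an illustration: each shard is modeled as an in-memory dict, and the helper names (`shard_for`, `save_user`, `load_user`) are hypothetical. In a real system each shard would be a separate database server.

```python
NUM_SHARDS = 4  # matches the user_id % 4 example


def shard_for(user_id: int) -> int:
    """Return the index of the shard that owns this user's data."""
    return user_id % NUM_SHARDS


# Assumption: a plain dict stands in for each shard's database.
shards = [dict() for _ in range(NUM_SHARDS)]


def save_user(user_id: int, profile: dict) -> None:
    """Write the profile to the one shard that owns this user."""
    shards[shard_for(user_id)][user_id] = profile


def load_user(user_id: int):
    """Read from the owning shard; returns None if the user is unknown."""
    return shards[shard_for(user_id)].get(user_id)


for uid in (8, 13, 42, 7):
    save_user(uid, {"name": f"user-{uid}"})
    print(uid, "-> shard", shard_for(uid))
```

Users 8, 13, 42, and 7 land on shards 0, 1, 2, and 3 respectively. Because reads and writes use the same function, a lookup only ever touches one shard.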

[Figure: how data is directed to shards by the sharding key]

Benefits of Sharding

  • Spreads load across multiple servers (evenly, as long as the sharding key distributes well)
  • Lets capacity grow by adding more shards instead of hitting a single machine's ceiling

Challenges of Sharding

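  1. Resharding Data: When shards fill up and new ones are added, existing data has to be redistributed. With a modulo-based key, changing the shard count remaps most records.
  2. Celebrity Problem: A few very popular keys can send disproportionate traffic to the shards that hold them.
  3. Joins and Denormalization: Cross-shard joins are difficult, so data often has to be denormalized.
  4. Complex Application Logic: The application (or a routing layer) must send every query to the correct shard.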
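
The first two challenges can be measured with a small Python sketch that uses the same modulo rule as the `user_id % 4` example. The function names and the traffic numbers are assumptions chosen for illustration, not measurements from a real system.

```python
from collections import Counter


def shard_for(user_id: int, num_shards: int) -> int:
    """Modulo-based shard selection."""
    return user_id % num_shards


def moved_fraction(user_ids, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    ids = list(user_ids)
    moved = sum(
        1 for uid in ids
        if shard_for(uid, old_shards) != shard_for(uid, new_shards)
    )
    return moved / len(ids)


def shard_load(requests, num_shards: int) -> Counter:
    """Count requests per shard for a stream of requested user IDs."""
    return Counter(shard_for(uid, num_shards) for uid in requests)


# Resharding: grow from 4 to 5 shards for user IDs 0..999.
print(moved_fraction(range(1000), 4, 5))

# Celebrity problem: hypothetical traffic where user 42 receives
# 900 of 1,000 requests and the rest are spread over users 0..99.
requests = [42] * 900 + list(range(100))
print(shard_load(requests, 4))
```

With these inputs, 80% of the keys change shards when a fifth shard is added, and shard 2 (the one holding user 42) serves 925 of the 1,000 requests while the other three serve 25 each.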

📋 When to Choose Replication vs. Scaling?

  • Use replication for read-heavy workloads and fault tolerance
  • Use scaling (in particular sharding) for write-heavy workloads and for traffic or data that keeps growing beyond what one server can handle

🎯 Conclusion

Scaling your database is crucial for supporting rapidly increasing data traffic and ensuring high availability. Whether you choose vertical scaling or horizontal scaling, the decision depends on your specific requirements and growth projections.

🔮 Coming Next

Stay tuned for our next blog on message queues and data centers, two powerful techniques for reducing latency and improving application performance!
