Database Scaling NLogN 📈

Published at
12/29/2024
Categories
systemdesign
database
webdev
computerscience
Author
ycwhencpp

What is Database Scaling?

Database scaling refers to the process of increasing a database's capacity to handle growing workloads, such as higher numbers of queries, larger datasets, or more concurrent users.
The primary goal is to ensure the database maintains performance, reliability, and availability as demand increases.

There are two primary types of database scaling:

  • ⬆️ Vertical Scaling (Scaling Up)
  • ➡️ Horizontal Scaling (Sharding)

Figure: Vertical scaling and horizontal scaling

🤔 Why Do We Need Database Scaling?

Suppose your current database server can handle up to 1000 queries per second (QPS). What happens if your website traffic spikes due to a new feature launch or increased popularity? Without scaling, your database will eventually fail under the load.
Scaling ensures that your database can meet the growing demands of your application, balancing data distribution, optimizing resources, and maintaining cost efficiency.

❓ Why Not Just Use Master-Slave Replication?

If you're familiar with master-slave replication, you might think it eliminates the need for scaling. (If not, check out my blog post on Database Replication NLogN 👈 to learn more.) While replication is effective for scaling reads and providing fault tolerance, it doesn't solve every challenge:

  1. Write Bottlenecks: Replication doesn't improve write capacity since all writes must still go to the master
  2. Hotspots: Frequently accessed data can overload specific nodes
  3. Scaling Limits: Replication doesn't address global traffic distribution needs
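The write bottleneck can be illustrated with a minimal sketch. The `ReplicatedRouter` class and the server names are hypothetical, introduced only for illustration; a real setup would route through a database driver or proxy.

```python
import itertools


class ReplicatedRouter:
    """Routes writes to the single master and spreads reads over replicas.

    Hypothetical helper for illustration; not part of any real library.
    """

    def __init__(self, master, replicas):
        self.master = master
        # Round-robin iterator over the read replicas.
        self._replica_cycle = itertools.cycle(replicas)

    def route(self, is_write):
        # Every write lands on the master, no matter how many replicas exist.
        if is_write:
            return self.master
        return next(self._replica_cycle)


router = ReplicatedRouter("master", ["replica-1", "replica-2", "replica-3"])
write_targets = [router.route(is_write=True) for _ in range(6)]
read_targets = [router.route(is_write=False) for _ in range(6)]

print(write_targets)  # all six writes go to "master"
print(read_targets)   # reads rotate across the three replicas
```

Adding more replicas to this router spreads reads further, but the list of write targets never changes, which is why replication alone does not raise write capacity.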

🛠️ Types of Database Scaling

1. ⬆️ Vertical Scaling (Scaling Up)

Vertical scaling involves upgrading the existing database server by adding more hardware, such as additional CPU, RAM, or storage.

Why Can't We Keep Adding More Hardware?

  1. High Costs: High-end servers cost disproportionately more for each additional unit of capacity
  2. Hardware Limitations:
    • Limited motherboard slots for RAM
    • Restricted cores and threads in CPUs
  3. Single Point of Failure: One server creates critical risk
  4. Scalability Ceiling: Hardware improvements have physical limits

2. ➡️ Horizontal Scaling (Sharding)

Horizontal scaling involves distributing data across multiple servers to divide the load. Each server (or shard) stores a unique subset of the data.

How Does Sharding Work?

Data is allocated to servers based on a sharding key. For example:

  • Use a hash function like user_id % 4 to determine which shard handles specific user data
  • Shard 0 stores data for users where the result is 0, and so on

Figure: How data is directed to shards
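The `user_id % 4` rule can be sketched in a few lines. The in-memory dictionaries stand in for four database servers, and the names `shard_for`, `save_user`, and `load_user` are hypothetical, chosen for this example only.

```python
NUM_SHARDS = 4  # matches the user_id % 4 example


def shard_for(user_id, num_shards=NUM_SHARDS):
    """Return the index of the shard that owns this user's data."""
    return user_id % num_shards


# Hypothetical stand-ins for four database servers.
shards = [dict() for _ in range(NUM_SHARDS)]


def save_user(user_id, record):
    # Writes go only to the shard selected by the sharding key.
    shards[shard_for(user_id)][user_id] = record


def load_user(user_id):
    # Reads use the same rule, so they look in the same shard.
    return shards[shard_for(user_id)].get(user_id)


for uid in (8, 13, 22, 7):
    save_user(uid, {"id": uid})

print([sorted(s) for s in shards])  # [[8], [13], [22], [7]]
print(load_user(22))                # {'id': 22}
```

Because reads and writes share one routing function, each query touches a single shard, which is what lets the load divide across servers.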

Benefits of Sharding

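  • Distributes load across multiple servers (evenly, as long as the sharding key spreads traffic well)
  • Allows capacity to keep growing by adding more shards, at the cost of redistributing existing data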

Challenges of Sharding

  1. Resharding Data: Redistributing data across new shards when capacity is reached
  2. Celebrity Problem: Disproportionate traffic to specific shards
  3. Joins and Denormalization: Difficulty in performing cross-shard joins
  4. Complex Application Logic: Applications must route queries correctly
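To see why resharding is costly with a simple modulo key, the sketch below counts how many users would land on a different shard when a fifth shard is added to a four-shard setup. The function name `moved_fraction` and the 1,000 sequential user IDs are assumptions made for illustration.

```python
def moved_fraction(user_ids, old_shards, new_shards):
    """Fraction of users whose shard changes under user_id % N routing."""
    ids = list(user_ids)
    moved = sum(1 for uid in ids if uid % old_shards != uid % new_shards)
    return moved / len(ids)


# Hypothetical workload: user IDs 0..999, growing from 4 to 5 shards.
fraction = moved_fraction(range(1000), old_shards=4, new_shards=5)
print(f"{fraction:.0%} of users change shards")  # 80% of users change shards
```

Under these assumptions most of the data has to be copied to a new server. Schemes such as consistent hashing are commonly used to reduce that movement, though they add their own complexity.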

📋 When to Choose Replication vs. Scaling?

  • Use replication for read-heavy workloads and fault tolerance
  • Use sharding (horizontal scaling) for write-heavy workloads and for traffic that outgrows a single server

🎯 Conclusion

Scaling your database is crucial for supporting rapidly increasing traffic and ensuring high availability. Whether vertical or horizontal scaling is the better fit depends on your specific requirements and growth projections.

🔮 Coming Next

Stay tuned for our next blog on message queues and data centers, two powerful techniques for reducing latency and improving application performance!
