Scaling Node.js: Handling 1 Million Requests Like a Pro
Node.js runs JavaScript on a single thread and uses an event loop to handle I/O asynchronously. This means that when users send requests to a Node.js server, a single process runs on one CPU core of the server and leaves the other cores unused.
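To make the single-thread limitation concrete, here is a minimal sketch (not taken from the original project) in which a hypothetical CPU-bound function, `busyWork`, occupies the thread so that a zero-delay timer cannot fire until the work finishes. The function names and the 50 ms figure are illustrative assumptions.

```javascript
// Hypothetical CPU-bound task: spins on the main thread for `ms` milliseconds.
function busyWork(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Busy-wait; nothing else can run on this thread in the meantime.
  }
}

// Resolves with how long a 0 ms timer actually waited while the thread was busy.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const start = Date.now();
    setTimeout(() => resolve(Date.now() - start), 0);
    busyWork(blockMs); // Runs before the timer callback gets a turn.
  });
}

measureTimerDelay(50).then((delay) => {
  console.log(`Timer scheduled for 0 ms fired after ${delay} ms`);
});
```

In a server, the timer callback stands in for every other pending request: while one request does heavy synchronous work, the rest wait.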
Requirements
- Node application that can handle millions of requests in a very short time.
- Each request's response time shouldn't go over a threshold of 500ms.
- System must be fault tolerant.
I found many ways to handle this, but the simplest solution I found uses Node.js's own Cluster module.
It uses all the CPU cores of the server by launching multiple instances of the Node application that listen on the same port. One primary process forks child processes (worker instances that can run on the other cores) and distributes incoming connections to them. Each child process handles its requests as a standalone instance.
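As a rough mental model of that distribution: the cluster module's default scheduling on every platform except Windows is round-robin, where the primary accepts connections and hands them to workers in turn. The sketch below only simulates that hand-off with plain data. `assignRoundRobin` and the worker IDs are hypothetical, and this is not how the module is implemented internally.

```javascript
// Simulation only: counts how many connections each worker would receive
// if they were handed out in strict round-robin order.
function assignRoundRobin(connectionCount, workerIds) {
  const perWorker = new Map(workerIds.map((id) => [id, 0]));
  for (let i = 0; i < connectionCount; i++) {
    const id = workerIds[i % workerIds.length];
    perWorker.set(id, perWorker.get(id) + 1);
  }
  return perWorker;
}

const counts = assignRoundRobin(10, ['worker-1', 'worker-2', 'worker-3']);
console.log(Object.fromEntries(counts));
// { 'worker-1': 4, 'worker-2': 3, 'worker-3': 3 }
```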
Code Implementation
Create Express application and install dependencies:
cd project-name
npm init -y
npm i express mongoose nodemon dotenv
General Approach
Then, in the index.js file, create a typical Express application:
require('dotenv').config();
const express = require('express');
const connectDB = require('./config/database');
const app = express();
app.use(express.json());
connectDB();
app.use("/api/items", require("./routes/items"));
app.get("/", (req, res) => {
  res.json({ message: "Welcome to the API" });
});
const PORT = process.env.PORT || 8000;
app.listen(PORT, () => {
  console.log(`Worker ${process.pid} started - Server running on port ${PORT}`);
});
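Now, run the application with the npx nodemon index.js command (nodemon was installed as a local dependency above, so npx is needed unless it is also installed globally).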
Then, install the loadtest package globally for load testing:
npm install -g loadtest
Now run the loadtest on the local server using this command:
loadtest -n 1000000 --rps 10000 http://localhost:8000 #use your port number
This command sends 1M requests at a maximum of 10K rps (requests per second).
This is the result:
Target URL: http://localhost:8000
Max requests: 1000000
Target rps: 10000
Concurrent clients: 3805
Running on cores: 6
Agent: none
Completed requests: 1000000
Total errors: 43446
Total time: 539.523 s
Mean latency: 97.7 ms
Effective rps: 1853
Percentage of requests served within a certain time
50% 10 ms
90% 265 ms
95% 543 ms
99% 1450 ms
100% 4509 ms (longest request)
-1: 43446 errors
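We can see that 43,446 requests (about 4.3%) failed to get a response. The 95th-percentile latency of 543 ms also exceeds the 500 ms threshold from the requirements.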
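The figures from this run can be checked against the requirements with a little arithmetic. The sketch below hard-codes the numbers reported by loadtest above. The `summarize` helper and its field names are illustrative and are not part of loadtest.

```javascript
// Numbers copied from the loadtest output of the single-process run.
const baseline = {
  completed: 1000000,
  errors: 43446,
  p95Ms: 543,
  effectiveRps: 1853,
  targetRps: 10000,
};

// Illustrative helper: derives the error rate, the share of the target rps
// that was actually achieved, and whether p95 latency meets the threshold.
function summarize(run, thresholdMs = 500) {
  return {
    errorRatePct: Number(((run.errors / run.completed) * 100).toFixed(2)),
    rpsAchievedPct: Number(((run.effectiveRps / run.targetRps) * 100).toFixed(1)),
    p95WithinThreshold: run.p95Ms <= thresholdMs,
  };
}

console.log(summarize(baseline));
// { errorRatePct: 4.34, rpsAchievedPct: 18.5, p95WithinThreshold: false }
```

Besides the failures, the single process sustained only about 18.5% of the requested rate, which is why the run took almost nine minutes.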
Cluster Module
Now let's use the Node.js Cluster module.
In the index.js file, change the code to use Cluster:
require('dotenv').config();
const cluster = require('cluster');
const numCPUs = require('os').cpus().length;
const express = require('express');
const connectDB = require('./config/database');

// Note: since Node.js 16, cluster.isMaster is a deprecated alias of cluster.isPrimary
if (cluster.isMaster) {
  console.log(`Master ${process.pid} is running`);
  console.log(`Number of CPUs: ${numCPUs}`);
  // Fork one worker per CPU core
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died`);
    cluster.fork(); // Replace the dead worker
  });
} else {
  const app = express();
  // Middleware
  app.use(express.json());
  // Connect to MongoDB (each worker opens its own connection)
  connectDB();
  // Routes
  app.use('/api/items', require('./routes/items'));
  // Basic route
  app.get('/', (req, res) => {
    res.json({ message: 'Welcome to the API' });
  });
  const PORT = process.env.PORT || 8000;
  app.listen(PORT, () => {
    console.log(`Worker ${process.pid} started - Server running on port ${PORT}`);
  });
}
Here the primary process forks one worker process per CPU core, so the application uses all available cores.
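One caveat with re-forking on every `exit` event: a worker that crashes immediately on startup would be re-forked in a tight loop. A possible guard, sketched here under assumed limits (5 restarts per 60 seconds, both arbitrary), is to track recent crash times and stop re-forking once the limit is reached. `createRestartLimiter` is a hypothetical helper, not part of the cluster module.

```javascript
// Hypothetical helper: allows at most `maxRestarts` restarts per `windowMs`.
function createRestartLimiter(maxRestarts = 5, windowMs = 60000) {
  let crashTimes = [];
  return function shouldRestart(now = Date.now()) {
    // Keep only the restarts that happened inside the sliding window.
    crashTimes = crashTimes.filter((t) => now - t < windowMs);
    if (crashTimes.length >= maxRestarts) return false;
    crashTimes.push(now);
    return true;
  };
}

// Usage inside the primary process (sketch):
// const shouldRestart = createRestartLimiter();
// cluster.on('exit', (worker) => {
//   if (shouldRestart()) cluster.fork();
//   else console.error('Too many worker crashes; not re-forking');
// });
```

Whether to give up, alert, or back off and retry later is a design choice that depends on how the process is supervised in production.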
When we run the application, the console output shows that it uses all the cores of the server.
Now, let's run loadtest again with the same command.
This time the run had a 100% success rate. The mean latency dropped from 97.7 ms to 9.9 ms, and 99% of the requests were answered successfully in under 200 ms, compared with about 1.5 s previously.
Find the whole codebase in this repo: [https://github.com/fahadfahim13/node-scale.git](https://github.com/fahadfahim13/node-scale.git)