How We Optimised Our Application to Serve 100 Requests/sec with a Response Time of 10ms
Not long ago, I was working on a side project: a URL shortener service, with the aim of making it serve 100 requests per second, each request taking 15 ms or less. This blog outlines the steps I took to optimise the system and achieve those goals.
Before I start, I want to thank my mentors, Abhinav Dhasmana, Atul R, and Kakul Gupta, for their guidance. Without their mentoring, I would not have been able to turn these goals into reality.
Table of contents:
- Introduction — The system
- The initial system and the hunt for optimisation points
- Optimisation 1 — Efficient writes: DB optimisation
- Optimisation 2 — Efficient reads: Caching
- Optimisation 3 — Multiple instances
Introduction — The system
The project we built was a URL shortener service, similar to bit.ly. The goal of the system is to give users the ability to shorten a long URL, making it easy to share across the internet.
The tech stack used is as follows:
- Hapi.js (a Node.js framework) for the backend
- Postgres as the database
- Sequelize as the ORM
- Redis for caching
- JMeter for load testing
In this blog, I am going to restrict myself to the performance optimisations. If you are interested in learning about the system and the various design decisions we took, head over to Abhinav’s article, which goes in depth into the system design of a URL shortener service.
The initial system and the hunt for optimisation points
The flow of the initial system is as follows:


With this implementation, we performed load testing using JMeter and got a response time of 40 ms. The following screenshot is for reference:

To find the points of optimisation in our system, we divided the application into three parts:
- Database optimisation for efficient read/write operations
- Caching, since our application has far more read operations than writes
- Utilising all cores of our hardware
I will now discuss how we optimised each of these individually.
Optimisation 1 — Efficient writes: DB optimisation
This optimisation affects reading and writing URLs to the database. Since our application depends heavily on the read/write speed of the database, we set up connection pooling so that we always have a pool of open database connections. This avoids paying the cost of establishing a new connection to the database on every request.
We configured the pool to keep between 5 and 20 connections, with the idle timeout set to 10 s.
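As a minimal sketch, this can be configured when creating the Sequelize instance; the database name and credentials below are placeholders, not values from the original code:

```javascript
// Sketch of Sequelize connection pooling for Postgres.
// Database name, user, and password are placeholders.
const { Sequelize } = require('sequelize');

const sequelize = new Sequelize('shortener', 'db_user', 'db_password', {
  host: 'localhost',
  dialect: 'postgres',
  pool: {
    min: 5,      // always keep at least 5 connections open
    max: 20,     // never open more than 20 connections
    idle: 10000, // release a connection after it sits idle for 10 s
  },
});
```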
Optimisation 2 — Efficient reads: Caching
If we trace a user’s happy path, we can observe that a short URL is written only once, whereas the same URL is read many times by many users. To reduce the database read time, we introduced a cache.
The cache we decided to go with was Redis. After incorporating caching, our flow became as follows:

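To make the read path concrete, here is a minimal cache-aside sketch assuming the node-redis (v4) client and a Sequelize model; the model, key format, TTL, and function name are illustrative, not taken from the original codebase:

```javascript
// Cache-aside read path: check Redis first, fall back to Postgres on a miss.
// The Url model, key format, and TTL are illustrative assumptions.
const { createClient } = require('redis');
const { Sequelize, DataTypes } = require('sequelize');

const sequelize = new Sequelize('postgres://localhost/shortener'); // placeholder URI
const redis = createClient();
redis.connect(); // node-redis v4 clients must connect before issuing commands

// Illustrative model: a short code mapped to the original long URL.
const Url = sequelize.define('Url', {
  code: { type: DataTypes.STRING, unique: true },
  longUrl: { type: DataTypes.TEXT },
});

async function resolveShortUrl(code) {
  // 1. Try the cache first.
  const cached = await redis.get(`url:${code}`);
  if (cached) return cached;

  // 2. On a miss, fall back to the database.
  const record = await Url.findOne({ where: { code } });
  if (!record) return null;

  // 3. Populate the cache with a TTL so stale entries eventually expire.
  await redis.set(`url:${code}`, record.longUrl, { EX: 3600 });
  return record.longUrl;
}
```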
Optimisation 3 — Multiple instances
Node.js executes our application code on a single thread, so a single process cannot utilise all the cores of our machine. To utilise multiple cores, we used pm2 to run one instance of the application per core. This enabled our application to handle more incoming connections, which addresses the 100 req/sec part of the goal.
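As a sketch, pm2 can run the app in cluster mode with one process per core via an ecosystem file; the app name and entry script below are assumptions:

```javascript
// ecosystem.config.js — pm2 cluster-mode sketch.
// The app name and entry script are placeholders.
module.exports = {
  apps: [
    {
      name: 'url-shortener',
      script: './server.js',
      instances: 'max',     // spawn one process per available CPU core
      exec_mode: 'cluster', // all processes share the same port
    },
  ],
};
```

The application is then started with `pm2 start ecosystem.config.js`.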
Results
After incorporating all of the above optimisations, we achieved our target. The final application gave us a response time of 5 ms. Refer to the following screenshot:

The complete code can be found here.
If you liked this post, please clap for it. Your claps motivate me to keep writing articles.