Load Balancers in System Design

Load balancing is a technique used in system design to distribute workloads across multiple resources (servers) to improve performance and scalability. It is an important consideration when building high-performance applications. Load balancing is also a commonly discussed topic in technical interviews for system design roles.

What is a load balancer?

A load balancer is a device or software that sits between clients and a group of servers and distributes workloads evenly to prevent any one server from becoming overloaded. By balancing the workload, a load balancer can improve the throughput and performance of the system. It also increases reliability by preventing servers from becoming overloaded and potentially failing.

Why do we need a load balancer?

We often use web services that respond to our requests quickly, but we may not be aware of the scale and complexity of the systems responsible for providing this fast response. When thousands of users request a service simultaneously, it can be a challenge to allocate these requests across multiple servers so that the system can handle the load.

If the load on servers increases too much, it can slow down the website and make it harder for users to get a fast and reliable response. One way to address this issue is to increase the number of servers, but this brings its own challenge: how do we distribute requests evenly across these servers? A load balancer solves this problem by distributing requests across multiple servers in a way that ensures a balanced workload. This enables the system to handle a large volume of requests.

Let's understand this from another perspective!

Suppose we have several clients sending requests to a single server. As the number of requests increases, two critical issues arise:

  • Server overloading: There is a limit to how many requests a single server can handle. If the number of requests exceeds this limit, the server may become overloaded and unable to function properly.
  • Single point of failure: If the single server goes down for any reason, the entire application becomes unavailable to users. This can result in a poor user experience and impact the overall reliability of the system.

How do we solve the above problems?

We can solve this scalability problem in two ways:

  • Vertical scaling: We can increase the power of our current server. However, there are limits to how much we can increase the capabilities of a single machine.
  • Horizontal scaling: We can add more servers to our system. In this case, a load balancer distributes requests across the servers, and we can increase our capacity to handle a large number of requests simply by adding more of them. A load balancer can also ensure that the service remains available even if one of the servers goes offline: it continuously checks server health and stops sending traffic to servers that are unable to fulfill requests.
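The health-check behavior described above can be sketched in a few lines of Python. This is a minimal, illustrative round-robin balancer: the server names and the in-memory health flags are assumptions for the example, whereas a real load balancer would probe backends over the network.

```python
import itertools

class RoundRobinBalancer:
    """Toy balancer: cycles requests over backends, skipping unhealthy ones."""

    def __init__(self, servers):
        self.health = {s: True for s in servers}  # all backends start healthy
        self._cycle = itertools.cycle(servers)

    def mark_down(self, server):
        # In practice this would be driven by failed network health checks.
        self.health[server] = False

    def pick(self):
        # Skip unhealthy backends; give up after one full rotation.
        for _ in range(len(self.health)):
            server = next(self._cycle)
            if self.health[server]:
                return server
        raise RuntimeError("no healthy backends available")

lb = RoundRobinBalancer(["app1", "app2", "app3"])
print([lb.pick() for _ in range(4)])  # ['app1', 'app2', 'app3', 'app1']
lb.mark_down("app2")
print([lb.pick() for _ in range(3)])  # app2 is now skipped
```

If `app2` later recovers, marking it healthy again returns it to the rotation, which mirrors how real load balancers re-admit servers once they pass health checks again.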

How does load balancing work?

Where do we add a load balancer?

Load balancers can be placed at different points in a system to distribute workload. Some common places to use load balancers are:

  • Between clients and frontend web servers: This is often the first point of contact between the client and the system. The load balancer receives incoming requests from clients and distributes them across the frontend web servers.
  • Between frontend web servers and backend application servers: In a system with multiple frontend web servers, a load balancer can distribute incoming requests from the web servers to the backend application servers.
  • Between backend application servers and cache servers: Load balancers can distribute requests from the application servers to cache servers, which store frequently accessed data in memory to reduce response times.
  • Between cache servers and database servers: In systems with multiple cache servers, a load balancer can distribute requests from the cache servers to the database servers, which store the actual data. This helps ensure that the database servers are not overwhelmed with requests.

Types of load balancers

There are two main types of load balancers: software load balancers and hardware load balancers. The main difference between them is the level of customization and scalability that they offer.

Software load balancers are more flexible and offer more options for customization. They can be installed on a server and configured to meet the specific needs of a system. They are also easier to scale, as additional capacity can be added by adding more servers or virtual machines.

Hardware load balancers, on the other hand, are physical devices that are installed in a network. They are generally less flexible and offer fewer options for customization. However, they are often faster and more reliable than software load balancers, as they are dedicated hardware devices that are designed specifically for load balancing.

Overall, the choice between a software or hardware load balancer depends on the specific needs and requirements of a system. Software load balancers are generally more suitable for systems that require a high level of customization and scalability, while hardware load balancers are better suited for systems that require high performance and reliability.

Pros and cons of software load balancers

Pros:

  • Flexibility: Highly flexible and can be easily adjusted to meet changing needs.
  • Scalability: Easily scale beyond their initial capacity by adding more instances of the software.
  • Lower cost: Cheaper than hardware load balancers because they can be installed on any standard device and do not require the purchase and maintenance of physical hardware.
  • Cloud-based load balancing: Software load balancers can be deployed in the cloud, which allows for easy scaling and cost savings.

Cons:

  • Delay when scaling: There may be a delay when scaling beyond the initial capacity of a software load balancer, as the software needs to be configured and set up.
  • Ongoing upgrades: There may be additional costs for ongoing upgrades to keep the software load balancer up to date.

Examples of software load balancers

  • HAProxy: A TCP/HTTP load balancer.
  • NGINX: An HTTP load balancer with SSL termination support.
  • mod_athena: An Apache-based HTTP load balancer.
  • Varnish: A reverse-proxy-based load balancer.
  • Balance: An open-source TCP load balancer.
  • LVS: Linux Virtual Server, offering layer-4 load balancing.
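For a concrete feel of how a software load balancer is configured, here is a sketch of an NGINX setup (NGINX appears in the list above) that balances HTTP traffic across a pool of backends. The upstream name, host names, and ports are placeholders invented for this example:

```nginx
# Hypothetical backend pool; host names and ports are placeholders.
upstream app_backend {
    least_conn;                        # send each request to the server with the fewest active connections
    server app1.internal:8080;
    server app2.internal:8080;
    server app3.internal:8080 backup;  # used only if the other servers are down
}

server {
    listen 80;
    location / {
        proxy_pass http://app_backend; # forward incoming requests to the pool
    }
}
```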

Pros and cons of hardware load balancers

Pros:

  • High performance: Offer low latency and consistent performance because the load balancing logic runs on specialized processors. They also provide high throughput.
  • Less prone to failure: Typically built on well-optimized and tested hardware platforms, with an underlying operating system that is optimized for performance and stability. This makes them less prone to failure compared to software load balancers.
  • Increased security: Improve security, since only authorized personnel can physically access the devices.

Cons:

  • Higher cost: Require a higher upfront cost for purchase and maintenance.
  • Limited scalability: Struggle to scale beyond a certain number of requests because they are limited by the hardware they are installed on.
  • More human resources and expertise required: Require more human resources and expertise to configure and manage, compared to software load balancers.

Examples of hardware load balancers

  • F5 BIG-IP load balancer
  • Cisco Systems Catalyst
  • Barracuda load balancer
  • Coyote Point load balancer
  • Citrix NetScaler

Explore popular load balancing algorithms used by load balancers, such as round robin, weighted round robin, least connections, and IP hash.
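Round robin is the simplest such algorithm; least connections is another common one and can be sketched in a few lines of Python. The server names here are illustrative, and a real balancer would track in-flight requests per backend itself rather than rely on the caller to report completions.

```python
class LeastConnectionsBalancer:
    """Toy least-connections balancer: route each request to the
    backend with the fewest in-flight requests."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}  # in-flight request count per backend

    def acquire(self):
        # min() breaks ties by picking the first server in insertion order.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Called when the backend finishes handling the request.
        self.active[server] -= 1

lb = LeastConnectionsBalancer(["app1", "app2"])
a = lb.acquire()  # 'app1' (all counts tied, first wins)
b = lb.acquire()  # 'app2'
lb.release(a)     # app1 finishes its request
c = lb.acquire()  # 'app1' again: it now has the fewest active connections
```

Unlike round robin, this strategy adapts when some requests take much longer than others, since slow backends accumulate active connections and receive less new traffic.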

Advantages of load balancing

  • Availability and scalability: Load balancers ensure that the application is always available and can scale as needed to meet demand.
  • Prevent server overload and single points of failure: Load balancers prevent any single server from being overloaded with requests, which would degrade overall application performance, and they remove the single point of failure by rerouting traffic when a server goes down.
  • Additional functionality: Load balancers can provide encryption, authentication, and other functionalities to secure, manage, and monitor the application. They can also provide efficient protection against DoS attacks.
  • Abstraction: End users only need to know the address of the load balancer, rather than the addresses of every server in the cluster, which provides a layer of abstraction.
  • Rolling software updates: Load balancers can be used to roll out software updates without taking the whole service down, by taking one server at a time out of rotation.
  • Improved response time and throughput: Load balancers can minimize server response time and maximize throughput by distributing requests evenly across servers.
  • Health checks and server monitoring: Load balancers can do health checks and monitor the request handling capability of servers to ensure that they are functioning properly.
  • Dynamic server management: We can add or remove servers based on the number of requests.

Critical concepts to explore further

  • What is the difference between Load Balancer and Reverse Proxy?
  • Different Categories of Load Balancing: 1) Layer 4 (L4) load balancer 2) Layer 7 (L7) load balancer 3) Global server load balancing (GSLB)
  • Health check feature of the load balancer.
  • DNS load balancing vs Hardware load balancing
  • The application load balancer in designing several systems
  • Cloud load balancing

Thanks to Navtosh for his contribution in creating the first version of this content. If you have any queries/doubts/feedback, please write us at contact@enjoyalgorithms.com. Enjoy learning, Enjoy system design, Enjoy algorithms!


©2023 Code Algorithms Pvt. Ltd.

All rights reserved.