Load Balancing Algorithms

In distributed systems, a load balancer uses various load-balancing algorithms to efficiently distribute incoming traffic across multiple servers. So based on the load-balancing algorithms, load balancer chooses one of the servers based on factors like the nature of the workload, configuration of the servers, etc. The primary goal is to maximize throughput, minimize response time, and avoid server overloading.

At the high level, we can divide load balancing algorithms into two categories: static load balancing and dynamic load balancing. Let's understand each one of them.

Static load balancing algorithms: These algorithms rely on fixed configurations and work the same way regardless of the state of the backend server (server load, server health, etc). They are simpler to set up, but they can lead to an uneven distribution of requests. Examples of static load balancing algorithms are round robin, weighted round robin, source IP hash, URL hash, randomized algorithm, etc.

Dynamic load balancing algorithms: These algorithms take into account the state of the backend server (server load, server health or some other factors) when distributing requests. They require communication between the load balancers and servers, so they are slightly more complex but can distribute requests efficiently. Examples of dynamic load balancing algorithms are least connection method, weighted least connections method, least response time method, etc.

Now let's discuss various static and dynamic load balancing algorithms one by one.

Round robin algorithm

Round robin is a simple load-balancing algorithm that directs client requests to different servers based on a rotating list. Load balancer maintains a list of available servers and directs incoming requests in a round-robin fashion: the first request to the first server, the second request to the second server, and so on. When the load balancer reaches the end of the list, it goes back to the beginning and starts from the first server again.

Round robin load balancing explained with example

Benefits and drawbacks of the round-robin algorithm:

Easy to implement.
Balances traffic evenly between servers.
Does not consider server load or specifications, so there is a risk that a server with low capacity will receive a large number of requests.
Works best if every server in the load balancer list has roughly the same specifications. Otherwise, a low-processing server may have the same load as a high-processing server.

Weighted round robin algorithm

The weighted round-robin load-balancing algorithm is an advanced version of the simple round-robin. It distributes incoming requests based on the weighted score of the servers. The weight can be an integer that reflects the server's processing power or specifications.

Benefits and drawbacks of the weighted round-robin algorithm:

Slightly more complex than the simple round-robin.
Based on the weight, some servers may receive a higher proportion of the overall request count.
Works well with servers with different specifications.
Does not consider the current load of each server.

Random load balancing algorithm

The randomized load-balancing algorithm randomly distributes requests to servers using a random number generator. When a load balancer receives a request, randomized algorithm evenly distributes it to the servers. Like the round-robin, this algorithm works well for a group of servers with similar configurations.

Does not maintain specific knowledge about server capacities or states.
Simple to implement.
Due to random distribution, always a chance that some servers may receive more requests.

Source IP hash algorithm

The source IP hash load-balancing algorithm generates the hash value based on the source IP addresses and allocates the request to a specific server. In other words, when a request arrives, the load balancer directs the request to a specific server based on the calculated hash value.

This mapping is deterministic i.e. requests from the same source IP address will always be directed to the same destination server. This can be useful for maintaining session state in certain applications.
If the session is broken, the key can be regenerated, and the client request can be directed back to the same server that was originally handling it. In other words, this algorithm is useful when a dropped connection needs to be returned to the same server.
If a server goes down or needs maintenance, clients associated with that server may experience disruption until the server is back online.

URL hash algorithm

Similar to the source IP hash, the URL hash load-balancing algorithm generates a hash value based on the URL present in the client request. So request will be forwarded to one of the servers based on the hash value.

Here load balancer can cache the hashed value of the URL, so subsequent requests that use the same URL result in a cache hit and are forwarded to the same server.
URL hash algorithm is useful when load-balanced servers serve mostly unique content per server. In other words, all requests related to one process (e.g., running code) will go to one server, and all requests related to another process (e.g., payments) will go to another server, and so on.

Least connection method

The least connection load-balancing algorithm considers the current load on a server. It does this by sending requests to the server with the least number of active connections.

Useful when there are many persistent connections in the traffic that are unevenly distributed between the servers. If the servers are busy with long computations, the connections between client and server stay alive for a long period.
Load balancer does additional calculations to determine the server with the least connections. So this can increase the load on the load balancer itself. Even this overhead can increase with the number of servers and the frequency of connection updates.
This is stateful i.e. it relies on maintaining state information about each server's active connections.

Weighted least connections method

The weighted least connections load-balancing algorithm distributes load based on both the number of active connections to each server and the relative capacity of the server.

Some servers can handle more connections than others.
Servers are rated based on their processing capabilities.

Least response time method

The least response time load-balancing algorithm is a more advanced version of the least connection method. It forwards requests to the server with the fewest active connections and the least average response time seen so far. So it dynamically adapts to changes in server load and response times.

Relies on the time taken by a server to respond to a health monitoring request.
Server that responds the fastest receives the subsequent request.
Considers the number of active connections on each server.
May require additional calculations to determine the server with the least average response time.

Note: Our system might have multiple load balancers that use different server selection strategies. Enjoy learning, enjoy system design!

Types of Load Balancing Algorithms

Round robin algorithm

Weighted round robin algorithm

Random load balancing algorithm

Source IP hash algorithm

URL hash algorithm

Least connection method

Weighted least connections method

Least response time method

Share Your Insights

More from EnjoyAlgorithms

Self-paced Courses and Blogs

Coding Interview

Machine Learning

System Design

OOP Concepts

Our Newsletter