Introduction to Database Replication (Types and Advantages)

The concept of database replication has evolved significantly in the last two decades due to the increasing use of distributed databases instead of databases with just a single node. But its core principles have remained the same over time. The reason is simple: The fundamental constraints of distributed systems have not changed significantly. In this blog, we will discuss the definition, types, advantages, disadvantages, and various tradeoffs related to database replication.

What is Database Replication?

Database replication is the process of keeping multiple copies of the same data on different servers (replicas) so that if one server goes down, the other servers can continue to serve data with little or no downtime.

  • Database replication provides several advantages like high performance, availability, and reliability. We will explore these benefits in a later section of this blog.
  • If the data that we are replicating does not change over time, then replication is easy: we just need to copy the data to every node once, and we are done. All the difficulty in replication lies in handling changes to replicated data. Think!
  • We can create a replica of an entire database or a subset of the database.

Database replication in distributed systems

Understanding database replication with an example

Suppose there is an online store that has a single database server that stores all the data related to the products, orders, and customers. If this server fails due to some reason (hardware issues, software issues, etc.) the website would become unavailable.

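To solve this problem, we can use database replication. The store can set up multiple database servers that replicate data from the primary database server. If the primary server goes down, another server can take over and continue to serve requests. Whether any recent writes are lost in this situation depends on how replication is configured, which we discuss in the sections on asynchronous and synchronous replication.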

Database replication techniques based on server architecture

In database replication, it's important to ensure that any updates made to the data on one node are reflected on all other nodes. So, we need efficient replication techniques that can ensure that all nodes have the most up-to-date version of the data. There are three popular types of database replication techniques based on server architecture. Each of these strategies has its advantages and disadvantages.

Single-leader replication

This is also known as Master-Slave or Active-Passive replication. In this architecture, there is a single leader (master) and several follower replicas (slaves). Here, all write requests are served by the single leader node, and all read requests are served by the leader or any of the follower nodes. After serving the write request, the leader node sends a stream of data changes to the follower nodes to update their current state of data.

A common setup is to use the leader only for write queries and distribute read requests across the followers. This takes read load off the leader and allows read requests to be served by nearby replicas.

  • As write operations are concentrated on a single node, if the leader node fails, the system may become unavailable until a new leader is elected, or the original leader node is restored.
  • During the replication process, we need to handle one of the critical problems: reads from follower nodes may not reflect the latest changes as there can be some delay in the replication process (replication lag).
  • This architecture is a good option for read-heavy workloads, where the ratio of reads to writes is very high.
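
The write path and the replication lag described above can be illustrated with a minimal in-memory sketch. The class names (Leader, Follower), the ship_changes method, and the sample key are hypothetical and chosen only for illustration; a real database ships changes over the network from a replication log.

```python
from itertools import cycle


class Follower:
    """Read-only replica that applies changes streamed from the leader."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

    def read(self, key):
        return self.data.get(key)


class Leader:
    """Accepts every write and records it in a log for the followers."""

    def __init__(self, followers):
        self.data = {}
        self.log = []        # ordered stream of (key, value) changes
        self.shipped = 0     # number of log entries already sent
        self.followers = followers

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

    def ship_changes(self):
        """Send every unsent log entry to all followers, in order."""
        for key, value in self.log[self.shipped:]:
            for follower in self.followers:
                follower.apply(key, value)
        self.shipped = len(self.log)


followers = [Follower(), Follower()]
leader = Leader(followers)
read_from = cycle(followers)   # spread reads round-robin over the followers

leader.write("stock:shoes", 12)
stale = next(read_from).read("stock:shoes")   # replication lag: not visible yet
leader.ship_changes()
fresh = next(read_from).read("stock:shoes")   # visible after the changes arrive
print(stale, fresh)
```

The first read returns nothing because the follower has not yet received the change, which is the replication lag problem in its simplest form.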

We have covered this idea of replication in a separate blog: Complete overview of Master-Slave Replication.

How the single-leader replication technique works

Multi-leader replication

There is one big problem with single-leader replication: If the leader is unavailable for some reason, then we cannot perform write operations until we promote one of the followers to be the new leader. A common solution to this problem is to use multi-leader replication. This is also called master-master or active-active replication.

Here, clients can send write requests to any one of several leader nodes, and each leader also acts as a follower of the other leaders. So the replication process happens in the same way: Whenever a leader node performs a write operation, it forwards a stream of data changes to all other leader nodes and follower nodes.

There are some challenges and complexities in multi-leader architecture. For example, conflicts can arise when two or more nodes receive conflicting write requests simultaneously. It’s important to have conflict resolution mechanisms in place to ensure data consistency. Note: In the near future, we will discuss multi-leader in detail in a separate blog.

How the multi-leader replication technique works

  • When data needs to be available in multiple regions, a multi-leader architecture can help reduce latency and improve performance. The idea is simple: Each region can designate its own leader node.
  • By having multiple leader nodes, a multi-leader architecture can provide redundancy and ensure that the system remains available even if one or more nodes fail. It can also enable write scaling because write requests can be distributed across multiple nodes.
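
To make the conflict problem mentioned above concrete, here is a minimal sketch of one possible conflict resolution mechanism: last write wins (LWW). This is an assumption chosen for illustration, not the only or the recommended strategy. The names Versioned, LeaderNode, and resolve_lww are hypothetical, and the timestamps are passed in by hand rather than read from a real clock.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: float   # assumed to come from the writing leader's clock
    node_id: str       # used only to break timestamp ties


def resolve_lww(a: Versioned, b: Versioned) -> Versioned:
    """Pick the winner of two conflicting writes (last write wins)."""
    return max(a, b, key=lambda v: (v.timestamp, v.node_id))


class LeaderNode:
    def __init__(self, node_id):
        self.node_id = node_id
        self.data = {}

    def write(self, key, value, timestamp):
        """Accept a client write locally and return the change to replicate."""
        change = (key, Versioned(value, timestamp, self.node_id))
        self.receive(change)
        return change

    def receive(self, change):
        """Apply a local or remote change, resolving any conflict."""
        key, incoming = change
        current = self.data.get(key)
        self.data[key] = incoming if current is None else resolve_lww(current, incoming)


europe, asia = LeaderNode("eu"), LeaderNode("asia")

# Two clients update the same key on different leaders at nearly the same time.
change_eu = europe.write("cart:42", "blue shirt", timestamp=100.0)
change_asia = asia.write("cart:42", "red shirt", timestamp=100.5)

# Each leader forwards its change to the other one.
asia.receive(change_eu)
europe.receive(change_asia)

print(europe.data["cart:42"].value, asia.data["cart:42"].value)
```

Both leaders converge on the same value, but the write with the older timestamp is silently discarded. This is why the choice of conflict resolution strategy needs care.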

No-leader replication

This replication is also known as Leader-less replication. In this architecture, clients send each write request to several nodes and read from several nodes in parallel to detect and correct nodes with stale data. There is no concept of a leader in this approach, which allows any replica to directly accept writes from clients.

Leaderless replication can provide high availability and fault tolerance. Because there is no single point of failure, the system can continue to function even if some nodes fail or become unavailable. It can also provide high read and write throughput because requests can be distributed across multiple nodes.

This method also poses challenges for synchronization, as it can be difficult to ensure that all nodes have the same view of the data at all times. In addition, handling conflicts that may arise from concurrent writes can be complex, and careful design and implementation are necessary to ensure data consistency.
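
The idea of writing to several nodes and reading from several nodes to detect and correct stale data can be sketched as follows. This is a simplified illustration under stated assumptions: the quorum sizes w and r, the client-supplied version numbers, and the names Replica, quorum_write, and quorum_read are all hypothetical, and the calls are sequential rather than parallel.

```python
class Replica:
    def __init__(self):
        self.store = {}    # key -> (version, value)
        self.up = True

    def put(self, key, version, value):
        """Store the value if it is newer; return False if the node is down."""
        if not self.up:
            return False
        current = self.store.get(key, (0, None))
        if version > current[0]:
            self.store[key] = (version, value)
        return True

    def get(self, key):
        return self.store.get(key, (0, None)) if self.up else None


def quorum_write(replicas, key, version, value, w):
    """Send the write to every replica; succeed if at least w acknowledge."""
    acks = sum(replica.put(key, version, value) for replica in replicas)
    return acks >= w


def quorum_read(replicas, key, r):
    """Read from reachable replicas, return the newest value, repair stale ones."""
    replies = [(rep, rep.get(key)) for rep in replicas]
    replies = [(rep, reply) for rep, reply in replies if reply is not None]
    if len(replies) < r:
        raise RuntimeError("not enough replicas reachable for a quorum read")
    version, value = max((reply for _, reply in replies), key=lambda reply: reply[0])
    for rep, reply in replies:
        if reply[0] < version:             # stale copy detected
            rep.put(key, version, value)   # read repair
    return value


nodes = [Replica(), Replica(), Replica()]
quorum_write(nodes, "user:1", 1, "alice", w=2)
nodes[2].up = False                        # one node misses the next write
ok = quorum_write(nodes, "user:1", 2, "alice-v2", w=2)
nodes[2].up = True                         # the node returns with stale data
value = quorum_read(nodes, "user:1", r=2)
print(ok, value, nodes[2].store["user:1"])
```

The write still succeeds with one node down, and the later read both returns the newest value and brings the stale node up to date.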

Note: In the near future, we will discuss leaderless replication in detail in a separate blog.

Database replication techniques based on the timing of data transfer

Asynchronous replication

In the Asynchronous Replication strategy, the leader node responds immediately to the client after updating its own copy of the data, without waiting for the changes to be propagated to the followers.

However, because the confirmation is sent before replication completes, data can be lost without the client’s knowledge. Replication happens in the background as the leader asynchronously propagates changes to the followers. If the leader crashes before a change has reached any follower, that change is lost permanently.

Despite this disadvantage, Asynchronous Replication is the default strategy for many data stores since it offers flexibility and ease of use. The idea is simple: the client is blocked only while the write happens on the leader, after which the client can continue to function as before.
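
The data loss window in asynchronous replication can be shown with a small sketch. The names AsyncLeader, Replica, and replicate_pending are hypothetical, and the background replication step is called by hand here so that the timing is easy to follow.

```python
class Replica:
    def __init__(self):
        self.data = {}


class AsyncLeader:
    def __init__(self, followers):
        self.data = {}
        self.followers = followers
        self.pending = []   # changes acknowledged but not yet replicated

    def write(self, key, value):
        """Apply locally, queue for replication, and acknowledge immediately."""
        self.data[key] = value
        self.pending.append((key, value))
        return "ack"

    def replicate_pending(self):
        """Background step: push queued changes to every follower."""
        for key, value in self.pending:
            for follower in self.followers:
                follower.data[key] = value
        self.pending.clear()


followers = [Replica(), Replica()]
leader = AsyncLeader(followers)

leader.write("order:1", "paid")
leader.replicate_pending()              # this change reaches the followers

ack = leader.write("order:2", "paid")   # the client already sees "ack" ...
# ... but suppose the leader crashes here, before replicate_pending() runs.
new_leader = followers[0]               # a follower is promoted
survived = "order:2" in new_leader.data
print(ack, survived)
```

The client received a confirmation for the second order, yet the promoted follower has no record of it.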

Synchronous replication 

In Synchronous Replication, once the Master node updates its own copy of the data, it initiates the write operation on its replicas. The replicas receive the update, apply the change to their copy of the data, and then send confirmation to the Master. Once the Master receives confirmation from all replicas, it responds to the client and completes the operation.

Synchronous Replication ensures that the replicas are always in sync and consistent with the Master, making this setup fault-tolerant by default. Even if the Master crashes, the entire data is still available on the replicas, so the system can easily promote any one of the replicas as the new Master and continue to function as usual.

One major disadvantage of Synchronous Replication is that the Client and the Master can remain blocked if a replica becomes non-responsive due to a crash or network partition. In other words, the Master will continue to block all writes until the affected replica becomes available again, bringing the entire system to a halt. So having a large number of replicas in the setup can result in longer block times for the client.
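
The blocking behaviour of synchronous replication can be sketched in the same style. The names SyncLeader, Replica, and ReplicaUnavailable are hypothetical. As a simplifying assumption, an unresponsive replica raises an exception straight away, whereas a real system would wait (block) until the replica responds or a timeout expires.

```python
class ReplicaUnavailable(Exception):
    """Raised when a replica does not confirm a write."""


class Replica:
    def __init__(self):
        self.data = {}
        self.responsive = True

    def apply(self, key, value):
        if not self.responsive:
            raise ReplicaUnavailable("no confirmation from replica")
        self.data[key] = value


class SyncLeader:
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas

    def write(self, key, value):
        """Update the local copy, then require confirmation from every replica."""
        self.data[key] = value
        for replica in self.replicas:
            replica.apply(key, value)   # raises if a replica does not confirm
        return "ack"


replicas = [Replica(), Replica()]
leader = SyncLeader(replicas)

first = leader.write("order:1", "paid")   # all replicas confirm

replicas[1].responsive = False            # crash or network partition
try:
    second = leader.write("order:2", "paid")
except ReplicaUnavailable:
    second = "blocked"                    # the client never gets a confirmation

print(first, second)
```

Every confirmed write is present on all replicas, but a single unresponsive replica is enough to stop the leader from confirming new writes.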

There are many trade-offs to consider with replication: for example, whether to use synchronous or asynchronous replication and how to handle failed replicas. Explore and think!

Other types of data replication techniques

Full replication

In the full replication method, we copy the entire original database to every replica. This makes data highly available (all replicas are constantly updated) and decreases query execution time (data can be fetched from the closest replica). But keeping the replicas updated can be slow because the entire database has to be maintained at every replica’s location.

Full replication is useful when users at different locations need to see the same view of the data. For example, users looking for football match scores need to see the same details about the match, regardless of their location.

Partial replication

In partial replication, we store a copy of only a selected part of the data from the original database at each replica. So the type and importance of the data determine the number of replicas required. The main benefit is that updating each replica is fast because each replica receives only a portion of the entire database.

But the problem is: If the local replica does not contain some required data, it needs to be fetched from the original database. This can increase the query execution time. So, partial replication is useful when we want to give users an isolated view of the data based on their location.

For example, suppose an online fashion retailer sells clothing items in different regions of the world. The retailer may have different inventories in different regions depending on local demand, weather, and cultural preferences. By replicating only the relevant data to each location, the retailer can ensure that customers in each region see only the products that are available to them. This reduces the load on the system because each replica handles only the data it needs.
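
The retailer example can be sketched with a small in-memory table. The rows, region codes, and function names (build_partial_replica, lookup) are made up for illustration. The fallback to the original database on a local miss is the extra cost described above.

```python
# Hypothetical product table held by the original (primary) database.
primary = [
    {"sku": "coat-01", "region": "EU", "stock": 40},
    {"sku": "sari-07", "region": "IN", "stock": 25},
    {"sku": "boot-03", "region": "EU", "stock": 10},
    {"sku": "tee-11", "region": "US", "stock": 90},
]


def build_partial_replica(rows, region):
    """Copy only the rows that belong to one region."""
    return {row["sku"]: row for row in rows if row["region"] == region}


def lookup(replica, rows, sku):
    """Serve from the local replica; fall back to the original database on a miss."""
    if sku in replica:
        return replica[sku], "local"
    for row in rows:
        if row["sku"] == sku:
            return row, "primary"
    return None, "missing"


eu_replica = build_partial_replica(primary, "EU")

_, source_hit = lookup(eu_replica, primary, "coat-01")   # stored locally
_, source_miss = lookup(eu_replica, primary, "tee-11")   # needs the primary
print(sorted(eu_replica), source_hit, source_miss)
```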

Advantages of database replication

The critical question is: Why do we want to replicate data? There are several reasons.

Improve performance

  • Replication helps us scale out the number of machines that can serve read and write queries, which increases read throughput and allows more queries to be processed in parallel.
  • Users get faster responses from a nearby data source than from one that is geographically distant. Replication helps us keep data geographically close to users, which reduces latency. A content delivery network (CDN) is a good example of this idea.

High Availability and Reliability

  • If one of your database servers is destroyed by a natural disaster, data is still preserved. We do not need to worry about data loss because data is replicated across multiple locations.
  • By replicating data across different database servers, our website remains in operation even if a database server is down (or offline) due to maintenance or some other reasons.

Disadvantages of database replication

While database replication can provide many benefits, there are also several disadvantages that should be considered before implementing it.

  • Replicating data across multiple servers can increase the complexity of the database system, which can make it more difficult to manage.
  • Implementing and maintaining a replication system can be expensive, as it requires additional hardware, software, and IT resources.
  • Replication lag can occur when changes made to the master database are not immediately propagated to the replicated databases. This can result in inconsistent data across different instances of the database.
  • Data conflicts can occur when changes are made to the same data in different instances of the database. This can result in data inconsistencies or even data loss if conflicts are not properly resolved.
  • Replication systems require ongoing maintenance, including monitoring, troubleshooting, and periodic updates. This can add to the overall maintenance overhead of the database system.

We will keep updating this blog with more insights on database replication. If you have any queries or feedback, please write us at contact@enjoyalgorithms.com. Enjoy learning, Enjoy system design!