For large-scale software applications, system design provides a high-level view of the components and their relationships. In other words, it helps us define the architecture by breaking the system down into smaller components, making it easier to understand.
So it is critical for software engineers and other tech professionals to understand various system design concepts in order to make informed decisions about scalability, performance, and other trade-offs. System design is also one of the important topics that tech companies cover during the interview process.
System design deals with designing a system that meets functional and non-functional business requirements. While designing such systems, one needs to think about various components, their complexity, performance, and so on. So a solid understanding of essential system design terms is invaluable.
We need to know about scalability (horizontal and vertical), caching, load balancing, data partitioning, various types of databases, network protocols, database sharding, and more. Similarly, we need to consider various trade-offs: latency vs. throughput, performance vs. scalability, consistency vs. availability, etc. Overall, system design is an open-ended discussion topic, which is why most top tech firms include one or two system design interview rounds.
So let’s dive in to get familiar with the essential concepts used in system design.
Availability means the system should always remain up and return a response to every client request. In other words, whenever a user wants to use the service, the system must be available and able to satisfy the request.
Availability can be quantified as the percentage of time the system remains operational within a given time window. We usually express a system's availability in terms of the number of 9s (as shown in the table below).
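To make the "number of 9s" concrete, here is a small illustrative sketch (not from the article) that converts an availability percentage into the maximum downtime it permits per year:

```python
# Downtime implied by common availability targets ("number of 9s").
SECONDS_PER_YEAR = 365 * 24 * 3600

def downtime_per_year(availability_percent):
    """Return the allowed downtime (in seconds) per year."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)

for nines in [99.0, 99.9, 99.99, 99.999]:
    hours = downtime_per_year(nines) / 3600
    print(f"{nines}% available -> ~{hours:.2f} hours of downtime per year")
```

For example, "three 9s" (99.9%) still allows roughly 8.76 hours of downtime per year, while "five 9s" allows only about 5 minutes.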
To design a highly available system, we need to consider several factors, such as redundancy, load balancing, failover mechanisms, and disaster recovery strategies. Our goal should be to minimize downtime and ensure that the system is always accessible and functioning as intended.
Throughput is one of the key metrics for measuring the performance of a system. It is the amount of data or the number of requests that can be processed within a given time period. In other words, throughput measures the system's capacity to handle multiple requests or data streams concurrently.
High throughput is important when handling a large volume of data or a high number of concurrent requests. One excellent way to achieve it is to split up the requests and distribute them across multiple machines.
Latency is another important performance metric. It measures how long the system takes to produce a result: the time between a client sending a request and receiving the response. This includes the time for the request to travel through the network, request processing, database queries, and other operations before the response is generated.
Lower latency means quicker response times, higher system performance, and a smooth user experience!
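As an illustration of how the two metrics relate, both can be estimated with a toy benchmark; `handle_request` below is a hypothetical stand-in for real work such as a database query:

```python
import time

def handle_request(x):
    # Placeholder for real work (network hop, DB query, processing).
    return x * x

requests = list(range(10_000))

start = time.perf_counter()
for r in requests:
    handle_request(r)
elapsed = time.perf_counter() - start

throughput = len(requests) / elapsed   # requests per second
avg_latency = elapsed / len(requests)  # seconds per request
print(f"throughput: {throughput:.0f} req/s, "
      f"avg latency: {avg_latency * 1e6:.1f} microseconds")
```

Note that the two are not simple inverses in a real system: with concurrency, throughput can rise even while per-request latency stays flat or grows.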
Almost every system depends on networks, which act as the communication platform between users and servers, or between different servers. Protocols are the rules governing how machines communicate over the network. Some common network protocols are HTTP and TCP/IP.
Load balancers are machines that distribute load among various servers. To scale the system, we can add more and more servers, so there must be a way to direct requests to those servers such that no single server is overloaded. Load balancers also help prevent a single point of failure.
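A minimal sketch of the idea, assuming a simple round-robin policy (one of several strategies a real load balancer might use):

```python
import itertools

class RoundRobinBalancer:
    """Toy round-robin load balancer: each request goes to the next
    server in the pool, so no single server absorbs all the traffic."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def route(self, request):
        server = next(self._cycle)
        return server, request

lb = RoundRobinBalancer(["server-a", "server-b", "server-c"])
for i in range(5):
    server, _ = lb.route(f"req-{i}")
    print(server)
# Requests cycle: server-a, server-b, server-c, server-a, server-b
```

Production load balancers also track server health, so a failed server is removed from the rotation instead of silently receiving requests.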
A proxy sits between the client and the server. When a client sends a request, it passes through the proxy before reaching the server. Proxies are of two types: forward proxy and reverse proxy.
A forward proxy acts as a mask for clients and hides the client's identity from the server. Similarly, a reverse proxy acts as a mask for servers and hides the servers' identity from the client.
Systems are associated with data that needs to be stored for later retrieval. One way to classify databases is based on their structure, which can be either relational or non-relational.
Relational databases, such as MySQL and PostgreSQL, enforce strict relationships among data and have a highly structured organization. Non-relational databases, such as Cassandra and Redis, have flexible structures and can store both structured and unstructured data. They are often used in systems that are highly distributed and require high speed.
Database partitioning is a way of dividing a database into smaller chunks to increase the performance of the system. It is another critical technique for improving latency and throughput so that more requests can be served. Here is a list of some popular databases, based on the Stack Overflow Developer Survey 2021.
Relational and non-relational databases offer different types of compliance: relational databases are associated with ACID compliance, while non-relational databases are associated with BASE compliance.
ACID: Atomicity, Consistency, Isolation, Durability
ACID compliance preserves the relational nature of databases and ensures that transactions are executed in a controlled, reliable manner. A transaction is an interaction with the database.
BASE: Basically Available, Soft State, Eventual Consistency
BASE compliance maintains the integrity of NoSQL databases and ensures their proper functioning. It is a key factor in the scalability of NoSQL databases.
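To see atomicity (the "A" in ACID) in action, here is a small sketch using Python's built-in `sqlite3` module; the account names and the simulated crash are illustrative:

```python
import sqlite3

# Atomicity demo: a transaction either commits fully or rolls back,
# so the database never observes a half-finished transfer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

try:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute(
            "UPDATE accounts SET balance = balance - 50 WHERE name = 'alice'")
        raise RuntimeError("crash before crediting bob")
except RuntimeError:
    pass

# The debit was rolled back along with the failed transaction.
balance = conn.execute(
    "SELECT balance FROM accounts WHERE name = 'alice'").fetchone()[0]
print(balance)  # 100
```

A BASE-style store would instead accept the write immediately and reconcile replicas later, trading this strict guarantee for availability.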
While designing any application, one needs to choose the type of storage according to the system requirements. If the system is distributed in nature and scalability is essential, then NoSQL databases are the best choice. NoSQL databases are also preferred when the amount of data is huge.
Conversely, SQL databases are favourable when data structure is more important. They are generally preferred when queries are complex and the database requires fewer updates. However, there is always a trade-off when choosing between a NoSQL and a SQL database. Sometimes, depending on business requirements, a polyglot architecture comprising both SQL and NoSQL databases is used to ensure application performance.
As a service grows and the number of requests increases, the system can become slow and performance may suffer. Scaling is the best way to address this issue: increasing the capacity of the system to handle more requests. There are two main approaches: horizontal scaling and vertical scaling.
Horizontal scaling involves adding more servers and distributing requests across them, so the system handles more traffic by spreading it over multiple machines. Vertical scaling, on the other hand, increases the capacity of a single machine by upgrading it with more resources such as CPU, memory, and storage.
Caching is an important technique that helps to improve the performance of a system by reducing its latency. It involves storing frequently used data in a cache so that it can be accessed quickly, instead of querying the database. However, implementing a cache also adds complexity to the system because it is important to maintain consistency between data stored in the cache and data stored in the main database.
Cache memory is relatively expensive, so the size of a cache must be limited. To ensure the best performance, various cache eviction algorithms, such as FIFO, LIFO, LRU, and LFU, are used to manage the cache. These algorithms determine which data should be removed from the cache when space is needed.
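As a sketch of one such policy, an LRU (least recently used) cache can be built on an ordered dictionary; the capacity and keys below are arbitrary:

```python
from collections import OrderedDict

class LRUCache:
    """Tiny LRU cache: when full, evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None  # cache miss
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now most recently used
cache.put("c", 3)      # capacity exceeded, so "b" is evicted
print(cache.get("b"))  # None
print(cache.get("a"))  # 1
```

FIFO and LFU differ only in what they track: insertion order and access frequency, respectively, instead of recency.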
Consistent hashing is a widely used concept in distributed systems because it offers flexibility in scaling the application. It is an improvement over traditional hashing methods, which are ineffective for handling requests over a network.
In consistent hashing, users and servers are placed on a virtual circular structure called a hash ring. The ring is treated as continuous and can accommodate any number of servers, without fixed allocation. Servers are assigned positions on the ring by a hash function. This allows requests or data to be distributed efficiently among servers and supports horizontal scaling, which increases throughput and reduces the latency of the application.
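A minimal hash-ring sketch, assuming MD5 as the hash function and 100 virtual nodes per server (both arbitrary illustrative choices):

```python
import bisect
import hashlib

class HashRing:
    """Sketch of consistent hashing: servers sit at hashed positions on a
    ring; a key is served by the first server clockwise from its hash."""

    def __init__(self, servers, replicas=100):
        self._ring = []  # sorted list of (position, server)
        for server in servers:
            for i in range(replicas):  # virtual nodes smooth the distribution
                pos = self._hash(f"{server}#{i}")
                bisect.insort(self._ring, (pos, server))

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def server_for(self, key):
        pos = self._hash(key)
        # First ring entry at or after pos, wrapping around the circle.
        i = bisect.bisect(self._ring, (pos, ""))
        return self._ring[i % len(self._ring)][1]

ring = HashRing(["cache-1", "cache-2", "cache-3"])
print(ring.server_for("user:42"))
```

The payoff is that adding or removing one server only remaps the keys adjacent to its ring positions, instead of reshuffling nearly everything as naive `hash(key) % num_servers` would.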
The CAP theorem is an essential concept for designing networked shared-data systems. It states that a distributed database system can provide only two of these three properties: consistency, availability, and partition tolerance. We can make trade-offs among the three properties based on the use cases of our system.
System design is an essential skill and is equally important from the interview point of view. One needs to be well aware of all the trade-offs while designing any system. In this blog, we tried to cover the basic concepts necessary for getting started. We hope that after reading it, you'll be familiar with the basics of system design.
We look forward to your feedback in the comments below. Enjoy learning, enjoy system design!
©2023 Code Algorithms Pvt. Ltd.
All rights reserved.