System design is one of the essential topics that large tech companies ask about during the interview process. It is also necessary for solving large-scale software problems.
After reading this blog, you will be familiar with the basic concepts related to system design. A good understanding of these topics will help you explore system design at an advanced level and solve problems.
System design deals with designing a system that meets functional and business requirements. While designing systems, one needs to consider various trade-offs, for example between the system's complexity and its performance.
Similarly, we need to weigh many trade-offs around the core functionality of the application that we are designing. Depending on our requirements, we need to decide between latency vs. throughput, performance vs. scalability, consistency vs. availability, etc.
Overall, system design is an open-ended discussion topic. That's why most top tech firms prefer to have one or two system design interview rounds. So let’s dive in to get familiar with the essential topics or concepts used in system design.
Availability is one aspect of a system's reliability. It means the system should remain up and return a response whenever a client sends a request for any operation. Whenever a user seeks the service, it should be available and able to satisfy the user's query.
Availability can be quantified by measuring the percentage of time the system’s services and functionalities remain operational in a given time window. We usually define the "availability percentages" for systems by the number of 9s of availability, as shown in the following table:
Availability of the service is generally increased by redundancy. Replication is used to offer redundancy by deploying a copy of the same server so that if one server goes down, it won’t affect the availability of the service.
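Two small calculations can make the availability discussion more concrete. The sketch below is only illustrative: it assumes a 365-day year when converting an availability percentage into a downtime budget, and it assumes replicas fail independently of each other, which real deployments only approximate. The function names are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Maximum downtime (in minutes per year) allowed by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def combined_availability(single: float, replicas: int) -> float:
    """Availability of `replicas` redundant copies, each available with
    probability `single`, when the service is up if at least one copy is up.
    Assumes failures are independent."""
    return 1 - (1 - single) ** replicas


# Downtime budget for two, three, four, and five nines.
for pct in (99.0, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct}% available -> about {minutes:.1f} minutes of downtime per year")

# Effect of adding one redundant replica to a 99%-available server.
print(f"Two 99% replicas -> {combined_availability(0.99, 2):.4%} combined availability")
```

Each additional nine cuts the downtime budget by a factor of ten, and under the independence assumption each additional replica multiplies the probability of a full outage by the failure probability of a single server.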
Throughput is the maximum rate of transfer or capacity of the system. It is used as a metric to determine how much work the system can do in a given time frame. The throughput of the system can be increased by scaling the system. Another way to increase throughput is to split up the requests and distribute them across multiple resources.
A proxy is a middleman that sits between the client and the server. When a client sends a request, it passes through the proxy and then reaches the server. Proxies are of two types: forward proxy and reverse proxy.
A forward proxy acts as a mask for clients and hides the client’s identity from the server, e.g., a VPN. Similarly, a reverse proxy acts as a mask for servers and hides the server’s identity from the client that receives the response, e.g., a load balancer.
Latency is the time taken to produce a result, i.e., the delay between making a request and receiving the desired output. It is a measure of the system's speed: the lower the latency, the faster the system.
Almost every system depends heavily on networks. Networks act as the platform for communication between users and servers or among different servers. Protocols are the sets of rules that govern how servers or machines communicate over the network. Some of the most common network protocols are HTTP, TCP/IP, etc.
Databases are used for storage. Every system is associated with some amount of data that needs to be stored somewhere so that it can be fetched easily when required. Databases are broadly classified into relational and non-relational databases.
Relational databases strictly enforce relationships among the data and are highly structured in nature. Examples: MySQL, PostgreSQL, etc.
Non-relational databases have flexible, less rigid structures. Data is stored in an unstructured way, and these databases are generally used in services that require high speed and are distributed in nature. Examples: Cassandra, Redis, etc.
Database partitioning is another critical concept used to improve the latency and throughput of the service so that more requests can be served. Sharding divides the database into smaller chunks to increase the performance of the service.
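As a rough illustration of sharding, the sketch below routes each key to one of several shards using a stable hash. It is a minimal in-memory model under stated assumptions: the shards are plain dictionaries standing in for separate database servers, the shard count is fixed at four, and the helper names (`shard_for`, `put`, `get`) are hypothetical.

```python
import hashlib

NUM_SHARDS = 4  # assumption: a fixed number of shards

# Each dict stands in for a separate database server.
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard index using a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def put(key: str, value) -> None:
    """Store the value on the shard that owns the key."""
    shards[shard_for(key)][key] = value


def get(key: str):
    """Read the value from the shard that owns the key (None if absent)."""
    return shards[shard_for(key)].get(key)


put("user:42", {"name": "Ada"})
print(get("user:42"), "lives on shard", shard_for("user:42"))
```

Because every read and write touches only one shard, the load is spread across machines. The weakness of this simple modulo scheme is that changing the number of shards remaps most keys.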
Here is the list of some popular databases based on Stack Overflow Developer Survey 2021.
Relational and non-relational databases offer different guarantees. Relational databases are associated with ACID, while non-relational databases are associated with BASE.
ACID: Atomicity, Consistency, Isolation, Durability
ACID compliance ensures that transactions in a relational database happen in a controlled, reliable manner. A transaction is an interaction with the database.
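To illustrate atomicity, the "A" in ACID, here is a small sketch using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper, and the sample balances are made up for the example. The transfer consists of two updates, and the point is that either both take effect or neither does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            # Violates the CHECK constraint if src cannot afford the transfer.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


print(transfer(conn, "alice", "bob", 30), balances(conn))   # transfer succeeds
print(transfer(conn, "alice", "bob", 500), balances(conn))  # rolled back, balances unchanged
```

In the second call the credit to `bob` is executed first, but it is undone when the debit from `alice` fails, so no money is created.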
BASE: Basically Available, Soft State, Eventual Consistency
BASE describes how NoSQL databases are expected to behave and is the main reason behind their scalability.
While designing any application, one needs to be clear about the type of storage that fits the system's requirements. If the system is distributed in nature and speed and scalability are essential, then NoSQL databases are the best choice. NoSQL databases are also preferred when the amount of data is huge. SQL databases, on the other hand, are favorable when the structure of the data is more important, and are generally preferred when queries are complex and the data requires fewer updates. However, there is always a trade-off when choosing between NoSQL and SQL databases. Hence, depending on business needs, a polyglot architecture comprising both SQL and NoSQL databases is sometimes used to ensure the performance of the application.
As your service grows and more and more requests come to the system, the service becomes slow, which affects its performance. The best way to mitigate this is by scaling, i.e., increasing the capacity of the system. There are two ways of doing so: horizontal scaling and vertical scaling.
Horizontal scaling means adding more servers to your application to distribute the requests. Vertical scaling, by contrast, means increasing the power or capacity of the same machine by upgrading it to handle more traffic.
Caching improves the performance of the system and helps reduce its latency. To make our application faster, we store frequently used data where it can be accessed in less time. A cache holds these pieces of data, so instead of querying the database, the data can be fetched quickly from the cache. However, adding a cache increases the system’s complexity. The data in the cache must be kept in sync with the data stored on disk so that the system remains consistent. Moreover, cache memory is expensive, so a cache cannot grow beyond a certain size. Hence, eviction algorithms such as LIFO, FIFO, LRU, and LFU are used to decide which entries to discard when the cache is full.
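Of the eviction policies mentioned above, LRU (least recently used) is easy to sketch. The class below is a minimal single-process illustration built on `collections.OrderedDict`. The class name and its methods are hypothetical, and it leaves out concerns such as thread safety and expiry that a production cache would need.

```python
from collections import OrderedDict


class LRUCache:
    """A fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest entry last

    def get(self, key, default=None):
        """Return the cached value and mark the key as recently used."""
        if key not in self._items:
            return default
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value) -> None:
        """Insert or update a value, evicting the oldest entry when full."""
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used key
cache.put("c", 3)  # cache is full, so "b" is evicted
print(cache.get("a"), cache.get("b"), cache.get("c"))
```

A typical read path checks the cache first and falls back to the database only on a miss, storing the fetched value in the cache for subsequent requests.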
A distributed system is a collection of many independent machines that work together by coordinating with each other to achieve a common goal. A distributed system works in such a manner that it appears as a single entity to the outside user. Nowadays, distributed systems are a necessity, and almost every application relies on distributed computing. Distributed systems operate concurrently and can offer highly scalable, reliable, and low-latency services.
Consistent hashing is a widely used concept in distributed systems, as it offers considerable flexibility in scaling the application. Consistent hashing is an improvement over normal hashing. Traditional hashing (for example, taking the hash of a key modulo the number of servers) remaps most keys whenever the number of servers changes, which makes it ineffective in a distributed setting. In consistent hashing, the requests (or keys) and the servers are placed on a virtual circular structure called the hash ring. The ring can accommodate any number of servers without a fixed allocation, and each server is assigned a position determined by a hash function. This allows requests or data to be distributed across servers and mapped to them efficiently. It helps in achieving horizontal scaling, which increases the throughput and reduces the latency of the application.
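The hash ring can be sketched in a few dozen lines. The version below is one common way to implement the idea, not the only one. It assumes SHA-256 as the hash function and places each server at several positions on the ring ("virtual nodes", an addition the text above does not mention) to even out the distribution. A key is served by the first server position found clockwise from the key's own position. All names are hypothetical.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Position on the ring: a 64-bit integer derived from SHA-256."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, servers=(), vnodes: int = 100):
        self.vnodes = vnodes   # positions per server (an assumption of this sketch)
        self._positions = []   # sorted ring positions
        self._owner = {}       # ring position -> server name
        for server in servers:
            self.add_server(server)

    def add_server(self, server: str) -> None:
        """Place the server at `vnodes` positions on the ring."""
        for i in range(self.vnodes):
            pos = _hash(f"{server}#{i}")
            if pos in self._owner:
                continue  # position already taken; extremely unlikely
            bisect.insort(self._positions, pos)
            self._owner[pos] = server

    def remove_server(self, server: str) -> None:
        """Remove all ring positions that belong to the server."""
        self._positions = [p for p in self._positions if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def server_for(self, key: str) -> str:
        """Return the first server clockwise from the key's position."""
        if not self._positions:
            raise LookupError("the ring has no servers")
        idx = bisect.bisect_right(self._positions, _hash(key)) % len(self._positions)
        return self._owner[self._positions[idx]]


ring = HashRing(["server-a", "server-b", "server-c"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.server_for(k) for k in keys}

ring.add_server("server-d")
after = {k: ring.server_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding a server")
```

When a server is added, the only keys that change owner are the ones that now belong to the new server, while with modulo-based hashing most keys would move.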
CAP Theorem is one of the essential concepts for designing networked shared-data systems. It helps us make trade-offs between three properties, consistency, availability, and partition tolerance, based on the unique use case of our system.
Load balancers are machines that balance the load among various servers. As a system scales, more and more servers are added, so there must be a way to direct requests to these servers such that no single server is overloaded and fails. Load balancers are used for this purpose.
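The text does not name a specific balancing strategy, so as an assumption the sketch below uses round-robin, one of the simplest: requests are handed to the servers in turn. The class name and server names are hypothetical, and a real load balancer would also track server health and current load.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in a fixed rotating order."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def next_server(self) -> str:
        """Return the server that should handle the next request."""
        return next(self._cycle)


balancer = RoundRobinBalancer(["server-1", "server-2", "server-3"])
for request_id in range(1, 7):
    print(f"request {request_id} -> {balancer.next_server()}")
```

With three servers, every third request goes to the same machine, so each server receives an equal share of the traffic as long as requests cost roughly the same to serve.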
System design is an essential skill to have and is equally important from the interview point of view at most top tech companies. One needs to be well aware of all the trade-offs while designing any system. In this blog, we tried to cover all the basic concepts necessary for getting started. We hope that after reading this blog, you are now familiar with the basics of system design.
Enjoy learning, Enjoy system design!