Google Docs is an online word processor that is part of Google’s free, web-based Google Docs Editors suite. If you spend a few minutes thinking about how Google Docs is built and how it works, you may realize that it is much more complex than it seems. Without further delay, let’s see how Google Docs works :)
Google Docs is a large system with many features, including document storage, sharing, formatting, and editing. The entire system can be broken down into these major components:
It’s always convenient to have a high-level overview of the system we’re trying to build. A good way to reason about an extensive system is to divide it into smaller components. Let’s have an overview of these components.
Google Docs has a storage mechanism that is similar to an operating system’s file system. It contains concepts like directories, files, and owners, among others. As a result, the fundamental building block of such a system is a file object that holds content, parent, owner, and some additional metadata such as creation date. The parent field expresses the folder relationship, and the root directory is the one whose parent is empty.
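The file object described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not how Google actually stores documents: the class name FileNode, the is_directory flag, and the path() helper are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class FileNode:
    """A file or directory; all field names here are illustrative assumptions."""
    name: str
    owner: str
    parent: Optional["FileNode"] = None  # None marks the root directory
    content: str = ""
    is_directory: bool = False
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def path(self) -> str:
        """Walk the parent links up to the root to build a full path."""
        names = []
        node = self
        while node.parent is not None:
            names.append(node.name)
            node = node.parent
        return "/" + "/".join(reversed(names))


root = FileNode(name="", owner="sam", is_directory=True)
reports = FileNode(name="reports", owner="sam", parent=root, is_directory=True)
notes = FileNode(name="notes.doc", owner="sam", parent=reports, content="BC")

print(root.path())   # /
print(notes.path())  # /reports/notes.doc
```

Storing only a parent reference keeps moves cheap: moving a document to another folder means changing a single field.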
Google Docs allows multiple users to work collaboratively on a single document, but how does it achieve this? Let’s understand it here :)
There are two types of algorithms that are used for collaborative editing.
Google Docs enables collaborative editing through an event-passing mechanism known as operational transformation, which maintains and synchronizes a consistent state between users working on a shared document. Suppose two users, Sam and Aditi, are simultaneously working on the same document, which has an initial state ‘BC’. When Sam and Aditi edit the doc, their respective changes are sent to the server together with the last revision each of them has seen. Sam inserts “A” at the beginning, and Aditi inserts “D” at position 2 (counting from zero, that is, after “C”). Because Sam’s “A” shifts every existing character one place to the right, Aditi’s change is transformed into an insert at position 3, and both users end up with ‘ABCD’. To make this possible, the server keeps track of the complete revision history of each document, which lets every collaborator keep editing without waiting for the others. The figure below shows how operational transformation works :)
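The Sam and Aditi example can be reproduced with a minimal sketch. It handles insert operations only and assumes a tie-breaking rule (an already-applied insert at the same position wins); the names Insert, transform, and Server are hypothetical and are not taken from Google’s implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    """Insert `text` at 0-based index `pos`."""
    pos: int
    text: str


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, applied: Insert) -> Insert:
    """Rewrite `op` so it keeps its intent after `applied` has already run."""
    # Assumption: when both target the same position, the applied op goes first.
    if applied.pos <= op.pos:
        return Insert(op.pos + len(applied.text), op.text)
    return op


class Server:
    def __init__(self, text: str) -> None:
        self.text = text
        self.log = []  # revision history; log[i] produced revision i + 1

    def receive(self, op: Insert, base_revision: int) -> int:
        """Transform `op` over every revision the sender has not seen yet."""
        for earlier in self.log[base_revision:]:
            op = transform(op, earlier)
        self.text = apply(self.text, op)
        self.log.append(op)
        return len(self.log)  # new revision number


server = Server("BC")
server.receive(Insert(0, "A"), base_revision=0)  # Sam
server.receive(Insert(2, "D"), base_revision=0)  # Aditi, still based on "BC"
print(server.text)     # ABCD
print(server.log[-1])  # Insert(pos=3, text='D')
```

A real editor also has to transform deletions and formatting changes, and the clients must apply the same transformations locally; this sketch covers only the server side of the insert case.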
You can invite collaborators to a Google Doc, and the system can keep a list of collaborators with their permissions, such as read-only, owner, and so on. When someone tries to perform a specific action, the system checks whether they are permitted to do it. However, for a better access control mechanism, we can use RBAC (Role-Based Access Control).
RBAC (role-based access control) restricts access based on the role each user holds. Employees in an RBAC system have access only to the information they need to do their jobs, and they can’t access information that doesn’t apply to them.
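A role check of this kind can be sketched as follows. The role names, the Permission values, and the SharedDocument class are assumptions made for illustration; they are not Google Docs’ actual roles or API.

```python
from enum import Enum


class Permission(Enum):
    READ = "read"
    COMMENT = "comment"
    EDIT = "edit"
    SHARE = "share"


# Hypothetical role table: each role maps to the actions it allows.
ROLE_PERMISSIONS = {
    "viewer": {Permission.READ},
    "commenter": {Permission.READ, Permission.COMMENT},
    "editor": {Permission.READ, Permission.COMMENT, Permission.EDIT},
    "owner": set(Permission),
}


class SharedDocument:
    def __init__(self, owner: str) -> None:
        self.collaborators = {owner: "owner"}  # user -> role

    def can(self, user: str, permission: Permission) -> bool:
        """Users without a role on this document get no access at all."""
        role = self.collaborators.get(user)
        return role is not None and permission in ROLE_PERMISSIONS[role]

    def grant(self, granted_by: str, user: str, role: str) -> None:
        if role not in ROLE_PERMISSIONS:
            raise ValueError(f"unknown role: {role}")
        if not self.can(granted_by, Permission.SHARE):
            raise PermissionError(f"{granted_by} may not share this document")
        self.collaborators[user] = role


doc = SharedDocument(owner="sam")
doc.grant("sam", "aditi", "viewer")
print(doc.can("aditi", Permission.READ))  # True
print(doc.can("aditi", Permission.EDIT))  # False
```

Because permissions hang off roles rather than individual users, changing what a viewer may do is a one-line edit to the role table.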
In the role-based access control data model, access can be limited to particular actions, such as reading, creating, or editing files, based on factors including authorization, responsibility, and job competency. RBAC-enabled solutions are better able to safeguard sensitive data and critical applications. RBAC has several advantages, including:
It’s pretty challenging to design a collaborative document editing platform. One difficulty is devising a system that allows multiple people to edit in the same area without conflicting edits. The second is ensuring that when many modifications occur simultaneously, each one is merged correctly.
In Google Docs, the first difficulty is solved by operational transformation, and the second is solved by the collaboration protocol, which is the topic of this section. To open a Google document, you need code running in two places: your browser and Google’s servers. The code that runs in your browser is referred to as the client. The client in the document editor processes all of your changes before sending them to the server, and processes other editors’ changes after receiving them from the server. To collaborate in Google Docs, each client must keep track of four pieces of information:
The server remembers three things:
By carefully using this information, it’s possible to design the client-server communication so that all editors can quickly process each other’s updates in real time.
Let’s look into actual flow by considering an example:
The above example concludes with Bob receiving John’s change and John receiving the acknowledgment of it. At this point, the server and both editors see the same document: Hello Google!
So this is how Google Docs works :)
When the performance or availability of such a system declines, we want to be warned so that we can monitor and troubleshoot the problem. We can do this by incorporating Google Cloud into our observability and alerting procedures.
Google Cloud services automatically provide observability data such as metrics, logs, and traces, which gives a comprehensive view of the system. We can also use third-party monitoring, alerting, and notification solutions: data from Cloud Monitoring and Cloud Logging can be exported and integrated into existing reliability and alerting systems.
In this blog, we tried to explain how Google Docs works. Designing real-time products like Google Docs can be tough. We hope this article gave you some insight into what goes on behind the scenes in Google Docs and the kinds of considerations needed to ensure a smooth collaborative experience. Please do share your views in the comments below.