Design Google Docs


Google Docs is an online word processor that is part of Google’s free, web-based Google Docs Editors suite. It is a massive system with a large number of features. If you spend a few minutes thinking about how Google Docs is built and how it works, you may realize that it is much more complex than it seems. Without further delay, let’s see how Google Docs works :)

Key requirements

Google Docs is a vast system with many features, including document storage, sharing, formatting, editing, and so on. The entire system can be broken down into the following major requirements:

  • The system should allow users to organize files (documents) into folders and to create, edit, and remove them. In this respect it behaves like the file system of an operating system.
  • Online editing is without a doubt one of Google Docs’ most valuable features. The system should allow users to edit and format documents.
  • Google Docs’ ability to let many people update a single document simultaneously is remarkable. Our system should allow multiple users to collaborate on the same document at the same time.
  • The system should allow users to share documents and assign different permissions to collaborators (owner, read-only, comment-only, etc.).

High-Level Design

It always helps to have a high-level overview of the system we’re trying to build. A good way to reason about the solution is to divide a large system into smaller components. Let’s go through these components here:

Storage and Formatting

Google Docs has a storage mechanism that is remarkably similar to an operating system’s file system. It has concepts such as directories, files, and owners. As a result, the fundamental building block of such a system is a file object that holds content, parent, owner, and some additional metadata such as the creation date. The parent field captures the folder relationship, and the root directory has an empty parent.
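The file object described above can be sketched roughly as follows. This is only an illustration: the class name FileNode, the field names, and the path_of helper are assumptions made for this example and are not taken from Google's actual implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class FileNode:
    """A document or folder. All names here are illustrative assumptions."""
    node_id: str
    name: str
    owner: str
    parent_id: Optional[str]   # None marks the root directory (empty parent)
    is_folder: bool = False
    content: str = ""          # left empty for folders
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def path_of(node_id: str, nodes: Dict[str, FileNode]) -> str:
    """Follow parent links up to the root to rebuild a path such as /reports/q1."""
    parts: List[str] = []
    current: Optional[str] = node_id
    while current is not None:
        node = nodes[current]
        if node.parent_id is not None:  # the root contributes no name
            parts.append(node.name)
        current = node.parent_id
    return "/" + "/".join(reversed(parts))


nodes = {
    "root": FileNode("root", "", owner="sam", parent_id=None, is_folder=True),
    "f1": FileNode("f1", "reports", owner="sam", parent_id="root", is_folder=True),
    "d1": FileNode("d1", "q1", owner="sam", parent_id="f1", content="BC"),
}
print(path_of("d1", nodes))  # /reports/q1
```

Storing only a parent reference on each node keeps moves cheap: moving a document to another folder means changing a single field.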

Concurrency

Google Docs allows multiple users to work collaboratively on a single document. How does it achieve this? Let’s understand it here :)

Two families of algorithms are commonly used for collaborative editing:

  1. Conflict-free Replicated Data Types (CRDTs)
  2. Operational Transformation (OT)

Google Docs supports collaborative editing with an event-passing mechanism known as Operational Transformation, which maintains and synchronizes a consistent state across users working on a shared document. The image below explains how OT works.

Suppose two users, Sam and Aditi, are simultaneously working on the same document, whose initial state is “BC”. When Sam and Aditi edit the document, their changes are sent to the server together with the last revision each of them has seen. Sam inserts “A” at the beginning (position 0), and Aditi inserts “D” at position 2. Because Sam’s “A” shifts the existing text one place to the right, Aditi’s change is transformed into an insert at position 3, so the document becomes “ABCD”. To make this possible, the server keeps track of the complete revision history of each document while allowing every collaborator to edit it. The figure below shows how Operational Transformation works :)
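The Sam and Aditi example can be reproduced with a minimal sketch of insert-only transformation. The Insert class, the transform function, and the tie-break on user name are assumptions made for illustration. A real OT system also has to handle deletes, formatting changes, and many other operation types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # 0-based index where the text is inserted
    text: str
    user: str


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, applied: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `applied` has taken effect."""
    # If both insert at the same position, the user name breaks the tie
    # (an arbitrary but deterministic rule chosen for this sketch).
    shifts = applied.pos < op.pos or (
        applied.pos == op.pos and applied.user < op.user
    )
    if shifts:
        return Insert(op.pos + len(applied.text), op.text, op.user)
    return op


doc = "BC"
sam = Insert(0, "A", "sam")
aditi = Insert(2, "D", "aditi")

# Server order: Sam's edit first, then Aditi's edit transformed against it.
server_doc = apply(apply(doc, sam), transform(aditi, sam))
# Aditi's client: her own edit first, then Sam's edit transformed against hers.
aditi_doc = apply(apply(doc, aditi), transform(sam, aditi))

print(transform(aditi, sam).pos)  # 3
print(server_doc, aditi_doc)      # ABCD ABCD
```

Both orders end in the same text, which is the property OT is meant to guarantee: every replica converges no matter the order in which concurrent edits arrive.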

Access Control

You can invite collaborators to a Google Doc, so the system can keep a list of collaborators for each document along with their permissions, such as read-only or owner. When someone attempts a specific action, the system checks whether their permission allows it. For a more robust access control mechanism, however, we can use RBAC (role-based access control).

RBAC restricts access based on individual users’ roles and responsibilities. Under RBAC, employees have access only to the information they need to do their jobs, and they cannot access information that does not concern them.

In the role-based access control data model, access can be limited to specific actions, such as reading, creating, or editing files, based on factors including authorization, responsibility, and job expertise. RBAC-based solutions are therefore better at safeguarding sensitive data and critical applications. RBAC has several advantages, including:

  • Increasing operational effectiveness
  • Increasing visibility for the owner
  • Reducing costs
  • Reducing the risk of data breaches
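The role-based model described in this section can be sketched as a mapping from roles to allowed actions, plus a per-document list of collaborators. The role names, the Action enum, and the DocumentAcl class are hypothetical names chosen for this example. They only loosely mirror the owner, read-only, and comment permissions mentioned earlier.

```python
from enum import Enum, auto
from typing import Dict, Set


class Action(Enum):
    READ = auto()
    COMMENT = auto()
    EDIT = auto()
    SHARE = auto()


# Each role maps to the set of actions it permits (illustrative roles).
ROLE_PERMISSIONS: Dict[str, Set[Action]] = {
    "viewer": {Action.READ},
    "commenter": {Action.READ, Action.COMMENT},
    "editor": {Action.READ, Action.COMMENT, Action.EDIT},
    "owner": {Action.READ, Action.COMMENT, Action.EDIT, Action.SHARE},
}


class DocumentAcl:
    """Collaborator list for one document: user name -> role."""

    def __init__(self, owner: str) -> None:
        self._roles: Dict[str, str] = {owner: "owner"}

    def is_allowed(self, user: str, action: Action) -> bool:
        role = self._roles.get(user)
        return role is not None and action in ROLE_PERMISSIONS[role]

    def grant(self, granted_by: str, user: str, role: str) -> None:
        """Assign a role; only users holding SHARE may change the list."""
        if not self.is_allowed(granted_by, Action.SHARE):
            raise PermissionError(f"{granted_by} may not share this document")
        if role not in ROLE_PERMISSIONS:
            raise ValueError(f"unknown role: {role}")
        self._roles[user] = role


acl = DocumentAcl(owner="sam")
acl.grant("sam", "aditi", "commenter")
print(acl.is_allowed("aditi", Action.COMMENT))  # True
print(acl.is_allowed("aditi", Action.EDIT))     # False
```

Checking permissions against roles rather than against individual users means a new capability only has to be added in one place, the role table.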

How does Google Docs work?

It’s pretty challenging to design a collaborative document editing platform. One difficulty is devising a way for several people to edit the same area of a document without their edits conflicting. The second is ensuring that when many modifications occur simultaneously, each one is merged correctly.

In Google Docs, the first difficulty is solved by operational transformation, and the second by the collaboration protocol, which is the focus of this section. Opening a Google document requires code running in two places: your browser and Google’s servers. The code that runs in your browser is called the client. In the document editor, the client processes all of your edits before sending them to the server, and processes other editors’ changes after receiving them from the server. To collaborate in Google Docs, each client must keep track of four pieces of information:

  1. The most recent revision delivered from the server to the client.
  2. Any local changes that have not yet been sent to the server.
  3. Any local changes that have been sent to the server but not yet acknowledged by it.
  4. The current state of the document as that editor sees it.
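The four pieces of client state listed above could be held in a structure like the one below. This is a simplified sketch under two assumptions that are not stated in this article: every change is modeled as text appended to the document, and the client keeps at most one batch of changes in flight, waiting for an acknowledgment before sending more. ClientState and its method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClientState:
    last_synced_revision: int = 0                     # 1. latest revision from the server
    pending: List[str] = field(default_factory=list)  # 2. local changes not yet sent
    sent: List[str] = field(default_factory=list)     # 3. sent, awaiting acknowledgment
    document: str = ""                                # 4. what this editor currently sees

    def local_edit(self, text: str) -> None:
        """Apply the edit locally right away and queue it for sending."""
        self.document += text
        self.pending.append(text)

    def flush(self) -> List[str]:
        """Return the changes to send now, or nothing if a batch is in flight."""
        if self.sent or not self.pending:
            return []
        self.sent, self.pending = self.pending, []
        return list(self.sent)

    def acknowledge(self, revision: int) -> None:
        """The server confirmed the in-flight batch as `revision`."""
        self.sent = []
        self.last_synced_revision = revision


client = ClientState()
client.local_edit("Hello")
first_batch = client.flush()       # ["Hello"] goes to the server
client.local_edit(" Google")
blocked = client.flush()           # [] because "Hello" is not acknowledged yet
client.acknowledge(revision=1)
second_batch = client.flush()      # [" Google"] can be sent now
```

Applying edits locally before the server confirms them is what keeps typing responsive. The pending and sent lists exist so that incoming remote changes can later be transformed against work the server has not seen yet.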

The server remembers three things:

  1. A list of all the changes it has received but not yet processed.
  2. The complete history of all the changes that have been processed.
  3. The document’s current state as of the latest processed change.
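The server side can be sketched in the same spirit. The Server class below keeps the three items listed above and, for each incoming change, transforms it against every revision committed after the sender's last synced revision. The names and the insert-only transform are assumptions for illustration, not Google's actual implementation.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Tuple


@dataclass(frozen=True)
class Insert:
    pos: int   # 0-based insertion index
    text: str


def transform(op: Insert, committed: Insert) -> Insert:
    """Shift `op` right if `committed` inserted at or before its position."""
    if committed.pos <= op.pos:
        return Insert(op.pos + len(committed.text), op.text)
    return op


@dataclass
class Server:
    # 1. Changes received but not yet processed, each paired with the
    #    revision the sending client had last synced to.
    pending: Deque[Tuple[Insert, int]] = field(default_factory=deque)
    # 2. Full history of processed changes; history[i] is revision i + 1.
    history: List[Insert] = field(default_factory=list)
    # 3. Document state as of the latest processed change.
    document: str = ""

    def receive(self, op: Insert, base_revision: int) -> None:
        self.pending.append((op, base_revision))

    def process_next(self) -> Tuple[int, Insert]:
        op, base_revision = self.pending.popleft()
        # Catch the change up with everything committed since the sender synced.
        for committed in self.history[base_revision:]:
            op = transform(op, committed)
        self.document = self.document[:op.pos] + op.text + self.document[op.pos:]
        self.history.append(op)
        return len(self.history), op


server = Server(document="BC")
server.receive(Insert(0, "A"), base_revision=0)
server.receive(Insert(2, "D"), base_revision=0)
rev1, op1 = server.process_next()
rev2, op2 = server.process_next()
print(rev2, op2.pos, server.document)  # 2 3 ABCD
```

Processing the queue strictly one change at a time gives every change a single, global revision number, which is what clients use to say how up to date they are.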

It’s possible to build client-server communication so that all editors can quickly process each other’s updates in real-time by carefully utilizing this information. Let’s have a look at how client-server communication is handled in a document with a simple example.

Workflow

Let’s walk through the actual flow with an example:

  1. Let’s say Bob starts the document by typing the word ‘Hello’ at the top.
  2. Bob keeps typing and adds the word ‘Google’ to the document. At the same moment, John types ‘!’ into his still-empty copy of the document.
  3. Bob’s first edit (‘Hello’) was added to his client’s list of pending changes. His client then sent it to the server and moved it to its list of sent changes.
  4. The server processed Bob’s edit as Revision 1 and forwarded it to John, whose client transformed it against his pending ‘!’ change using operational transformation (OT). As a result of the transformation, John’s pending change was shifted by five positions to make room for Bob’s ‘Hello’ at the top of the document. When Bob and John received the messages from the server, they both updated their last synced revision number to 1.
  5. Next, both Bob and John sent their pending changes to the server.
  6. Because Bob’s change reached the server before John’s, the server processed it first. Bob received an acknowledgment of the change. The change was also delivered to John, whose client transformed it against his ‘!’ change, which was still awaiting acknowledgment.
  7. The server then received John’s change, which John’s client believed should be Revision 2. The server, however, had already added a Revision 2 to the revision log, so it applied OT to John’s change and saved it as Revision 3.
  8. To do this, the server compared John’s change with all the changes committed since the last time John synced with the server. In this example, it transformed John’s change against Bob’s (‘ Google’ at 6). As a result, the index of John’s change was shifted by seven, the length of ‘ Google’ including its leading space. This shift is identical to the one John’s client made in step 6 when it received the same change from Bob.

The above example concludes with Bob receiving John’s change and John receiving the server’s acknowledgment. At this point, the server and both editors are looking at the same document: Hello Google!
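The index arithmetic in this walkthrough can be checked with a short trace. The sketch uses 1-based positions to match the example and assumes that Bob's second edit is the seven-character string ' Google', including its leading space, so that the final text reads "Hello Google!". The transform function is an illustrative, insert-only stand-in for a full OT implementation.

```python
from typing import List, Tuple

Op = Tuple[int, str]  # (1-based position, inserted text)


def transform(op: Op, earlier: Op) -> Op:
    """Shift `op` right when `earlier` was inserted at or before its position."""
    pos, text = op
    earlier_pos, earlier_text = earlier
    if earlier_pos <= pos:
        return (pos + len(earlier_text), text)
    return op


def apply(doc: str, op: Op) -> str:
    pos, text = op
    return doc[:pos - 1] + text + doc[pos - 1:]


bob_hello: Op = (1, "Hello")      # becomes Revision 1
bob_google: Op = (6, " Google")   # becomes Revision 2
john_bang: Op = (1, "!")          # typed into John's empty copy

# Step 4: John's client transforms his pending '!' against Revision 1.
john_after_rev1 = transform(john_bang, bob_hello)            # (6, "!")
# Step 6: John's client transforms it again against Revision 2.
john_client_view = transform(john_after_rev1, bob_google)    # (13, "!")

# Steps 7 and 8: the server receives John's change, which is based on
# Revision 1, and transforms it against everything committed since then.
history: List[Op] = [bob_hello, bob_google]
john_last_synced = 1
server_op = john_after_rev1
for committed in history[john_last_synced:]:
    server_op = transform(server_op, committed)
history.append(server_op)  # saved as Revision 3

document = ""
for op in history:
    document = apply(document, op)

print(server_op == john_client_view)  # True
print(document)                       # Hello Google!
```

The server and John's client arrive at the same position independently, which is why John's screen does not jump when the acknowledgment comes back.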

So this is how Google Docs works :)

Optional

This section covers monitoring and observability. It is purely optional, so feel free to skip it :)

When the performance or availability of such a system declines, we want to be alerted so that we can monitor and troubleshoot the problem. We can do this by incorporating Google Cloud into our observability and alerting procedures.

We can use third-party monitoring, alerting, and notification solutions by sending them observability and event data from our Google Cloud services. Google Cloud services automatically provide observability data such as metrics, logs, and traces, giving a comprehensive view of the system. Data from Cloud Monitoring and Cloud Logging can be exported and integrated into your existing reliability and alerting systems.

Conclusion

In this blog, we tried to explain how Google Docs works. Designing real-time products like Google Docs can be tough. We hope this article gave you some insight into what goes on behind the scenes in Google Docs and the kinds of considerations you should make to ensure a smooth collaborative experience. Please do share your views in the comments below.
