Have you ever experienced a website taking longer time to load on the first visit but significantly faster on subsequent visits? This is related to caching: An idea to improve the performance of distributed systems. In this blog, we will discuss the caching concept, how it works, and what are the important types of caching.
Caching is a process of storing data in a temporary storage location called cache to improve the speed of data access. Now the question is: What is cache? A cache is a high-speed storage that stores a small proportion of critical data so that future requests for that data can be served faster.
Let's understand this from another perspective! Accessing data from the main database will take a good amount of time (due to disk I/O or network latency). So when data is accessed or processed for the first time, the system will store that data in a faster memory i.e. cache. If there is a further request for the same data, the system will serve it directly from the cache.
Cache memory is very costly, so its size is much small compared to the size of the main database. So what type of data should we store in the cache? One basic idea is to store the most frequently accessed data. Why? We highly recommend to explore the reason.
So one of the critical goals of caching is to make data retrieval fast by accessing frequently accessed data from the cache instead of retrieving it again from the main database. This will take significantly less time, which will improve the performance.
Now let's understand how caching works in practice when a user requests some data.
The above idea is one of the commonly used caching strategies: cache-aside strategy. Here cache and database are independent and it is the responsibility of the application code to manage operations on cache and database. This is useful when we are dealing with read-heavy systems.
There can be other types of caching strategies like read-through strategy, write-around strategy, write-back strategy and write-through strategy. We will discuss these concepts later in this blog.
Note: Cache hits and misses are critical metrics for measuring cache performance.
Cache eviction policies are algorithms that manage data stored in a cache. When the cache is full, some data needs to be removed in order to make room for new data. So cache eviction policy determines which data to remove based on certain criteria. There are several cache eviction policies:
When data in the database is constantly being updated, it is important to ensure that the cache is also updated to reflect these changes. Otherwise, the application will serve outdated or stale data. So, we use cache invalidation techniques to maintain the consistency of the cache.
We mostly use five types of caching strategy to solve the problem of cache invalidation: Cache-aside strategy, Read-through strategy, Write-through strategy, Write-around strategy and Write-back strategy. We have already discussed the cache-aside strategy in the above section. So let's discuss the rest of them.
In this strategy, cache is present between the application code and the database. In case of a cache miss, it retrieves data from the database, stores that data in the cache and returns it to the application. So read-through caches are also one of the good choices for read-heavy systems.
Both cache-aside and read-through look similar, but there are differences. In cache-aside, application code is responsible for retrieving data from the database and storing it in the cache. In read-through, this is supported by the cache provider. So this strategy simplifies the application code by abstracting away the complexity of cache management.
In both strategies, the most basic approach is to write directly to the database. Due to this, cache will be inconsistent with the database. So what is the solution?
In write-through cache, writes are first made to the cache and then to the database. If both writes are successful, write operation is considered successful. This will ensure that cache and database remain consistent and reduces the risk of data loss in case of a crash or system disruption.
In write-around cache, write operations bypass the cache and go directly to the database rather than being written to the cache first. This technique does not ensure that data in the cache is always up-to-date because updated data may not be available in the cache.
The write-back cache is used in systems with high write activity to improve write performance. Writes are temporarily stored in a cache layer, where they are quickly verified and then asynchronously written to the database. This results in lower write latency and higher write throughput.
However, this technique carries the risk of data loss if cache layer fails, because cache is the only copy of the written data. To minimize this risk, it is recommended to have multiple cache replicas that acknowledge the write. This way, if one cache fails, data can still be recovered from another cache.
Let's move forward to understand some popular approaches to caching used in distributed systems. We apply these approaches at various stages to improve performance and scalability.
When you revisit the same website, the browser will retrieve these resources from the cache, rather than downloading them from the network again. This will result in a faster page load time and reduce the amount of data that needs to be transmitted. This is also known as client-side caching.
Browser cache has limited capacity and is set to store resources for a specific time duration. When the cache reaches its capacity or resources reach their expiration date, the browser will automatically clear the cache and retrieve updated resources from the network during the next visit. Users can also manually clear their browser cache if it becomes full.
Web server caching is used to improve performance by storing resources on the server side. This reduces the load on the server. There are several ways to implement web server caching i.e. reverse proxy cache and key-value store such as Memcached or Redis.
The reverse proxy cache acts as an intermediary between the browser and the web server. When a user makes a request, the reverse proxy cache checks if it has a copy of the requested data. If yes, it will serve a cached version to the user rather than forwarding the request to the web server.
We can also use the key-value database to cache application data. These databases are typically accessed by the application code. Unlike reverse proxies, which cache HTTP responses for specific requests, key-value databases can cache any user-specific data or frequently accessed data based on need.
Content Delivery Network (CDN) is designed to improve the delivery speed of static content like web pages, images, videos, and other media files. These proxy servers are located in strategic locations around the world to reduce the distance between the end user and the origin server (reducing latency).
When a user requests content from a website that uses a CDN, CDN fetches the content from the origin server and stores a copy of it. If the user requests the same content again, CDN serves content directly rather than fetching it again from the origin server.
Think CDN like a chain of grocery stores: Instead of going all the way to farms where food is grown, which could be hundreds of miles away, customers can go to their local grocery store. Grocery store stocks food from faraway farms, allowing customers to get the food they need in a matter of minutes, rather than days.
Distributed caching is the practice of using multiple caching servers that are spread across a network. Unlike traditional caches, which are usually limited to the memory of a single machine, distributed cache can scale beyond the memory limits of a single machine by linking together multiple machines (distributed clusters).
In distributed caching, each caching server maintains a portion of the cached data, and requests for data are directed to the appropriate server based on a hashing algorithm or some distribution strategy.
By storing frequently accessed data, caching reduces the need to retrieve data from slower data sources. This will result in faster access, quicker response times and better user experience.
In summary, caching is a useful technique for improving performance, reducing cost, and increasing the scalability of a system by storing frequently accessed data in fast storage called cache!
Thanks to Chiranjeev and Navtosh for their contribution in creating the first version of this content. If you have any queries or feedback, please write us at firstname.lastname@example.org. Enjoy system design!
Subscribe to get well designed content on data structure and algorithms, machine learning, system design, object orientd programming and math.
©2023 Code Algorithms Pvt. Ltd.
All rights reserved.