Have you ever noticed that a website takes longer to load on the first visit but becomes significantly faster on subsequent visits? This is a common phenomenon related to caching, a critical concept in system design. In this blog, we will discuss how caching helps us enhance the performance of a system.
Suppose there is a library with thousands of books and a librarian who helps students find and borrow books. When a student arrives and requests a book, the librarian goes to the storeroom to retrieve it, brings it back to the desk, and issues it to the student. When the student returns the book, the librarian puts it back in its place in the storeroom. If another student arrives and requests the same book, the librarian must go to the storeroom again to retrieve it. In this system, the librarian makes a trip to the storeroom for every single request.
Now, let's consider a scenario where the librarian has a bag that can hold 15 books, serving as a cache for recently returned books. When a student arrives and requests a book, the librarian goes to the storeroom and retrieves it, just as before. But when the student returns the book, the librarian places it in the bag instead of returning it to the storeroom. If another student requests the same book, the librarian checks the bag and finds the book there. This means the librarian does not have to go to the storeroom again, and the student can be served more efficiently.
Caching is a technique for improving the speed of data retrieval by storing data or the results of a request in a temporary storage location called a cache. A cache is high-speed data storage that holds a small proportion of critical data, typically the most frequently requested data.
The purpose of caching is to make data retrieval more efficient by reusing previously retrieved data rather than fetching it again from a slower storage layer (such as disk or RAM). This improves system performance because the time required to retrieve data is significantly reduced.
Cache hits and misses are critical metrics for measuring the performance and effectiveness of a cache. A high cache hit rate means the cache is effectively storing and retrieving the most frequently accessed data. On the other hand, a high cache miss rate suggests the cache is not being used effectively, in which case we may need to adjust the cache size or switch to a different replacement policy. The sketch below shows one way to track these metrics.
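As a minimal illustration (names are ours, not from any particular library), here is a tiny in-memory cache that counts its own hits and misses, where the hit rate is hits / (hits + misses):

```python
class CountingCache:
    """A tiny in-memory cache that tracks hit/miss metrics."""

    def __init__(self):
        self.data = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.data:
            self.hits += 1      # cache hit: data served from the cache
            return self.data[key]
        self.misses += 1        # cache miss: caller must fetch elsewhere
        return None

    def put(self, key, value):
        self.data[key] = value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = CountingCache()
cache.put("a", 1)
cache.get("a")              # hit
cache.get("b")              # miss
print(cache.hit_rate())     # 0.5
```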
Now there are several questions to think about and explore: What types of caching can we use? How do we decide which data to remove when the cache is full? And how do we keep the cache consistent with the database?
We can use several types of caching to improve the performance of a system.
If you're wondering how websites load quickly and efficiently, one of the reasons is browser caching. This involves temporarily storing resources such as images, HTML, and JavaScript files in a cache within the web browser.
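Browsers decide what to cache largely based on HTTP response headers such as Cache-Control. As a rough sketch using only Python's standard library (the handler, port, and response body are illustrative), a server can tell the browser it may reuse a response for an hour:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class CachedAssetHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/css")
        # Allow the browser to reuse this response for 3600 seconds
        # without contacting the server again.
        self.send_header("Cache-Control", "public, max-age=3600")
        self.end_headers()
        self.wfile.write(b"body { color: teal; }")

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), CachedAssetHandler).serve_forever()
```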
Web server caching improves the performance of a website by storing resources on the server side, which reduces load on the server. There are several ways to implement it, e.g., a reverse proxy cache or a key-value store such as Memcached or Redis.
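As an example of the key-value-store approach, here is a minimal sketch using the redis-py client. It assumes a Redis server running on localhost, and render_page is a hypothetical stand-in for expensive server-side work:

```python
import redis  # requires the redis-py package and a running Redis server

r = redis.Redis(host="localhost", port=6379)

def render_page(page_id: str) -> str:
    # Stand-in for an expensive operation such as templating + DB queries.
    return f"<html><body>Page {page_id}</body></html>"

def get_page(page_id: str) -> str:
    key = f"page:{page_id}"
    cached = r.get(key)
    if cached is not None:           # cache hit: skip the expensive work
        return cached.decode()
    html = render_page(page_id)      # cache miss: do the work once
    r.set(key, html, ex=300)         # keep the result for 5 minutes
    return html
```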
A Content Delivery Network (CDN) is a system of proxy servers designed to improve the delivery speed of static content, such as web pages, images, videos, and other media files. These proxy servers are placed in strategic locations around the world to reduce the distance between the end user and the origin server, thereby reducing latency.
Think of a CDN like a chain of grocery stores: instead of going all the way to the farms where food is grown, which could be hundreds of miles away, customers go to their local grocery store. The grocery store stocks food from faraway farms, allowing customers to get what they need in a matter of minutes rather than days.
A distributed cache is a system that pools together the random-access memory (RAM) of multiple networked machines into a single in-memory data store used as a cache. Unlike traditional caches, which are usually limited to the memory of a single machine, a distributed cache can scale beyond those limits by linking together multiple machines (a distributed cluster) to increase capacity and processing power.
Distributed caches are useful in environments with high data volume and load, because the distributed architecture allows for incremental expansion by simply adding more machines to the cluster. This lets the cache grow along with the data and handle large amounts of it efficiently.
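One building block of a distributed cache is deciding which machine owns which key. A minimal sketch (node addresses are hypothetical) routes each key to a node by hashing; note that production systems typically use consistent hashing instead of plain modulo so that adding or removing a node remaps as few keys as possible:

```python
import hashlib

# Hypothetical cache nodes in a distributed cluster.
NODES = ["cache-a:11211", "cache-b:11211", "cache-c:11211"]

def node_for(key: str) -> str:
    # Hash the key and map it onto one of the nodes, so each key
    # consistently lives on the same machine in the cluster.
    digest = hashlib.md5(key.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

print(node_for("user:42"))   # always routes to the same node
print(node_for("user:43"))   # may route to a different node
```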
Database caching is a technique that stores frequently accessed data in a cache to improve the performance of a database. This can significantly increase throughput and reduce data-retrieval latency, resulting in better overall application performance. A database cache layer can be placed in front of any database, including relational and NoSQL databases.
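One common way to build such a layer is the cache-aside pattern: the application checks the cache first and falls back to the database on a miss. Here is a self-contained sketch using SQLite, with a plain dict standing in for a real cache like Redis (the table and data are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Alice')")

user_cache = {}   # in-process cache; Redis/Memcached in a real deployment

def get_user_name(user_id):
    if user_id in user_cache:                 # cache hit: no database query
        return user_cache[user_id]
    row = conn.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    name = row[0] if row else None
    user_cache[user_id] = name                # populate the cache on a miss
    return name

print(get_user_name(1))   # miss: reads from SQLite
print(get_user_name(1))   # hit: served from the cache
```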
Cache eviction policies are algorithms or strategies that manage the data stored in a cache. When the cache is full, some data needs to be removed to make room for new data, and the eviction policy determines which data to remove based on certain criteria.
There are several common cache eviction policies:

- Least Recently Used (LRU): removes the entry that has not been accessed for the longest time.
- Least Frequently Used (LFU): removes the entry that is accessed least often.
- First In, First Out (FIFO): removes the entry that has been in the cache the longest.
- Random Replacement (RR): removes a randomly selected entry.

LRU is one of the most widely used in practice; a minimal sketch of it follows.
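This sketch builds an LRU cache on Python's OrderedDict: every access moves an entry to the "most recently used" end, and when the cache exceeds its capacity, the entry at the other end is evicted:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None                      # cache miss
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now the most recently used
cache.put("c", 3)      # evicts "b", the least recently used
print(cache.get("b"))  # None
```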
When data in the database is constantly being updated, it is important to ensure that the cache is also updated to reflect these changes; otherwise, the application may serve outdated or stale data to clients. So we use cache invalidation techniques to keep the cache consistent with the latest changes in the database. There are three popular cache invalidation schemes:
In write-through caching, writes are first made to the cache and then to the database. The write operation is considered successful only if both writes succeed. This approach keeps the cache and database consistent and reduces the risk of data loss in the event of a crash or system disruption.
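A write-through update can be sketched as a single function that writes to both stores (plain dicts stand in for a real cache and database):

```python
def write_through(key, value, cache: dict, db: dict):
    # The write counts as successful only if both updates succeed; with
    # plain dicts these assignments cannot fail, but a real implementation
    # would check and propagate errors from each store.
    cache[key] = value   # update the cache first
    db[key] = value      # then persist to the database
```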
In write-around caching, we write data directly to the database and bypass the cache. This increases cache misses, resulting in higher read latency for applications that frequently write and then re-read the same data.
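By contrast, a write-around update skips the cache and drops any stale copy, so the next read of that key is a cache miss (again with dicts standing in for real stores):

```python
def write_around(key, value, cache: dict, db: dict):
    db[key] = value          # the write goes straight to the database
    cache.pop(key, None)     # remove any stale cached copy; next read misses
```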
The write-back cache is used in systems with high write activity to improve write performance. Writes are temporarily stored in a cache layer, where they are quickly acknowledged and then asynchronously written to the database. This approach results in lower write latency and higher write throughput.
However, this technique carries the risk of data loss if the cache layer fails, because until a write is flushed, the cache holds the only copy of that data. To minimize this risk, it is recommended to have multiple cache replicas acknowledge each write, so that if one cache fails, the data can still be recovered from another replica.
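The following sketch mimics a write-back cache with a background thread that asynchronously flushes buffered writes to a simulated slow database (the queue, sleep, and data are illustrative):

```python
import queue
import threading
import time

db = {}                         # stand-in for a slow database
cache = {}                      # fast cache layer that absorbs writes
write_queue = queue.Queue()     # buffered writes awaiting persistence

def flusher():
    # Background worker: drain buffered writes into the database.
    while True:
        key, value = write_queue.get()
        time.sleep(0.01)        # simulate slow database I/O
        db[key] = value
        write_queue.task_done()

threading.Thread(target=flusher, daemon=True).start()

def write_back(key, value):
    cache[key] = value               # fast write, acknowledged immediately
    write_queue.put((key, value))    # persisted later; lost if cache dies first

write_back("user:1", "Alice")
write_queue.join()                   # in this demo, wait for the flush
print(db)                            # {'user:1': 'Alice'}
```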
There are several benefits to using caching in a system or application:

- Improved performance: frequently accessed data is served from fast storage instead of a slower backend.
- Reduced cost and load: fewer requests reach the database or origin server.
- Better scalability: the system can handle more traffic with the same backend capacity.
Caches are used in modern software to improve performance, but they can also have drawbacks:

- Stale data: if invalidation is not handled carefully, the application may serve outdated data.
- Added complexity: the system gains one more layer to configure, monitor, and debug.
- Extra resource cost: cache memory and infrastructure are not free.
In summary, caching is a useful technique for improving performance, reducing cost, and increasing the scalability of a system or application by storing frequently accessed data in a fast and easily accessible location. Both the system and its users benefit from the resulting performance and efficiency gains.
Thanks to Chiranjeev and Navtosh for their contributions in creating the first version of this content. If you have any queries/doubts/feedback, please write to us at contact@enjoyalgorithms.com. Enjoy learning, Enjoy system design, Enjoy algorithms!