Implement Least Recently Used (LRU) Cache

Difficulty: Hard, Asked-in: Amazon, Microsoft, Adobe, Google

Key takeaway: an excellent algorithm to learn data structure design and problem-solving using hash tables and doubly-linked lists.

Understanding the LRU Cache problem

Least Recently Used (LRU) is one of the popular caching strategies: when the cache is full, it discards the least recently used items first to make room for new elements. The cache keeps items in order of their use, which makes it easy to identify items that have not been used for a long time.

So our goal is to design a data structure that follows the constraints of a Least Recently Used (LRU) cache. We need to implement the LRUCache class with the following operations:

  • LRUCache(int capacity): initialize the LRU cache with positive size capacity.
  • int get(int key): return the value of the key if the key exists, otherwise, return -1.
  • void put(int key, int value): update the value of the key if the key exists. Otherwise, add the key-value pair to the cache. If the number of keys exceeds the capacity from this operation, evict the least recently used key.

We want the following specifications from our LRU cache:

  • Access time for any item should be O(1).
  • Time required to get the least recently used element should be O(1).
  • The space required should be O(n).
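To make this specification concrete before designing anything, here is a small reference model of the required behavior. It is a sketch using Python's collections.OrderedDict purely as a convenient stand-in; the class name LRUCache comes from the problem statement, but the data-structure design we actually want is discussed below.

```python
from collections import OrderedDict

class LRUCache:
    """Reference model of the required behavior, built on OrderedDict.
    Keys are kept in order from least to most recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return -1
        self.data.move_to_end(key)      # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)  # refresh usage order before updating
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used key

cache = LRUCache(2)
cache.put(1, 1)
cache.put(2, 2)
print(cache.get(1))   # key 1 exists
cache.put(3, 3)       # capacity exceeded: evicts key 2
print(cache.get(2))   # key 2 was evicted, so -1
```

All OrderedDict operations used here run in O(1) time, so this model already meets the specification; the rest of the article builds the same behavior from first principles.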

Let's understand the problem via an example

Suppose we have five elements, A1 to A5, in the main memory, and let the size of our cache be 3.

LRU cache example: step 1

Initially, the cache is empty, and all the elements are stored in memory. We want to get A1 first. We get the value of A1 from memory and store it in the cache.

LRU cache example: step 2

Next, we want to get A2. A2 gets stored at the topmost level, and A1 is moved down as it is no longer the most recently used element.

LRU cache example: step 3

Next, we want to get A3.

LRU cache example: step 4

Now suppose we want to get A2 again. Instead of getting this from memory, we can get this from our cache. Notice that the position of A2 is at the top again, as A2 is the most recently used element now.

LRU cache example: step 5

Now we want to get A4. We have to get it from memory. But where will we store it in our cache? We have to remove some elements so that we can keep A4. So, we remove the least recently used element, A1, in this case.

LRU cache example: step 6

Since our cache can store only three elements, we need to discard the least recently used element from our cache.
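The walkthrough above can be simulated in a few lines. This sketch uses Python's OrderedDict to track usage order; the "topmost level" in the figures corresponds to the most-recently-used end of the ordering here.

```python
from collections import OrderedDict

CAPACITY = 3
cache = OrderedDict()   # keys ordered from least to most recently used

def access(key):
    """Simulate a get: on a hit, mark the key most recently used;
    on a miss, load it from 'memory', evicting the LRU key if full."""
    if key in cache:
        cache.move_to_end(key)         # cache hit
    else:
        if len(cache) == CAPACITY:
            cache.popitem(last=False)  # evict least recently used
        cache[key] = "value of " + key # cache miss: load from memory

for key in ["A1", "A2", "A3", "A2", "A4"]:
    access(key)
    print(list(cache))  # keys from least to most recently used
```

After the final access the cache holds A3, A2, A4: exactly as in the figures, A1 was the least recently used element and got evicted when A4 arrived.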

Before designing the implementation of the LRU cache, let us look at why a cache is needed at all. Generally, retrieving data from a computer's main memory is expensive. To avoid accessing main memory repeatedly, a small high-speed memory known as cache memory is used. A cache holds frequently requested data and instructions so that they are immediately available to the CPU. Thus, cache memory reduces the average time for accessing data from the main memory.

The cache memory size is generally much smaller than the main memory. So we cannot fit everything from the main memory into the cache. There are different ways of handling this; LRU cache is one such way. The main idea of the LRU cache is to store the n recently accessed elements (assume that the size of the cache is n).

A brute force approach: LRU Cache using a simple array

We initialize an array of size equal to that of our cache. Each data element stores extra information: an access time stamp, which records when the key was last used. We will use the timeStamp to find the least recently used element in the LRU cache.

class DataElement 
{
    int key;
    int value;
    int timeStamp;   // when this key was last accessed (0 = most recent)

    public DataElement(int k, int data, int time)
    {
        key = k;
        value = data;
        timeStamp = time;
    }
}

int get(int key): we traverse the array and compare the key of each data element with the given key. If we find a match, we set the timeStamp of that data element to 0 and return its value. If we don't find any such element, we return -1. Time complexity = O(n)

void put(int key, int value): when the array is not full, we increase the timeStamp of every existing element by 1, set the timeStamp of the new data to 0, and insert it into the array. When the array is full, we must first delete the least recently used element, i.e. the one with the largest timeStamp. For this, we iterate through the array, find the element with the largest timeStamp, and place the new element (with the new key and value) in its slot. Time complexity = O(n)
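If you want to check your own attempt at this approach, here is one possible Python sketch. The class names DataElement and ArrayLRUCache mirror the description above (ArrayLRUCache is an illustrative name, not from the problem statement); both operations are O(n).

```python
class DataElement:
    def __init__(self, key, value, time_stamp):
        self.key = key
        self.value = value
        self.time_stamp = time_stamp  # 0 = most recently used

class ArrayLRUCache:
    """Brute-force LRU cache on a plain list. A smaller time_stamp means
    more recently used; eviction removes the largest time_stamp."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.elements = []

    def get(self, key):
        for e in self.elements:
            if e.key == key:
                for other in self.elements:   # age every element...
                    other.time_stamp += 1
                e.time_stamp = 0              # ...then refresh the hit
                return e.value
        return -1

    def put(self, key, value):
        for e in self.elements:               # age every existing element
            e.time_stamp += 1
        for e in self.elements:
            if e.key == key:                  # key already present: update
                e.value = value
                e.time_stamp = 0
                return
        if len(self.elements) == self.capacity:
            lru = max(self.elements, key=lambda e: e.time_stamp)
            self.elements.remove(lru)         # evict least recently used
        self.elements.append(DataElement(key, value, 0))
```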

A brute force approach: LRU Cache using a singly linked list

int get(int key): we traverse the linked list and look for a data element whose key equals the given key. If it is found, we move that data element to the head of the linked list and return its value. Otherwise, we return -1. Time complexity: O(n)

void put(int key, int value): if the length of the linked list is less than the size of our cache, we simply insert the new data element at the head of the linked list. Otherwise, we first delete the least recently used element, which is present at the tail of the linked list, and then insert the new element at the head. Time complexity: O(n)
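Similarly, here is one possible Python sketch of the singly-linked-list approach to compare your own attempt against (ListLRUCache is an illustrative name). Both operations remain O(n) because of the traversal.

```python
class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.next = None

class ListLRUCache:
    """Brute-force LRU cache on a singly linked list: the head is the
    most recently used element, the tail the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.head = None
        self.size = 0

    def _detach(self, key):
        """Unlink and return the node with this key, if present."""
        prev, curr = None, self.head
        while curr and curr.key != key:
            prev, curr = curr, curr.next
        if curr:
            if prev:
                prev.next = curr.next
            else:
                self.head = curr.next
            self.size -= 1
        return curr

    def get(self, key):
        node = self._detach(key)
        if node is None:
            return -1
        node.next = self.head      # move to head: most recently used
        self.head = node
        self.size += 1
        return node.value

    def put(self, key, value):
        node = self._detach(key)
        if node:                   # key already present: update in place
            node.value = value
        else:
            if self.size == self.capacity:
                prev, curr = None, self.head
                while curr.next:   # walk to the tail (least recently used)
                    prev, curr = curr, curr.next
                if prev:
                    prev.next = None
                else:
                    self.head = None
                self.size -= 1
            node = Node(key, value)
        node.next = self.head      # insert at the head
        self.head = node
        self.size += 1
```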

The critical observation: the time complexity of both get and put operations in both approaches is O(n), which is inefficient and does not satisfy the specification given in the problem. It's an exercise for you to write the implementation code for both approaches!

An efficient approach using a hash map and doubly linked list

Solution Idea

When we look at our requirements, we can conclude that we need a structure to keep track of the elements present in the cache. But we also need to store them in a specific order: the least recently used item should be at the bottom, and we should be able to move any item to the top in constant time. A doubly linked list supports this. However, accessing an arbitrary element of a doubly linked list takes O(n) time, so we also need a hash map that maps each key to its linked list node. This lets us find an element's node in O(1) time.

When inserting new data, we insert it at the head of the linked list and record it in the map. After each cache hit, we move the accessed node to the head of the linked list. When the linked list is full, we discard the tail of the linked list and delete the corresponding key from the map.

Solution Steps

Following are the steps we need to take whenever we access any element:

  1. Look up the element in our hash map.
  2. If the element is present in the hash map, it is present in our cache, and we do not need to access it from memory. This is known as a cache hit.
  3. Find the corresponding linked list node for this element from the hash map.
  4. Move this node to the top, as this element is now the most recently used.
  5. If the element is not present in the hash map, it is not in our cache. That means we need to get its value from memory and load it into our cache. This is known as a cache miss.
  6. If the cache is full, we need to remove the least recently used element node. It will be the last node of the linked list.
  7. Discard this element by removing the last node from the linked list and deleting its entry from the hash map.
  8. Now, create a new node for this element and insert it at the top of our cache, i.e. the head of the linked list.
  9. Also, store this node in the hash map.

If we follow all these steps, we can update our cache in O(1) time, as each step takes O(1) time. Following is the solution code for the LRU Cache question from LeetCode.

Solution code: LRU Cache Java Implementation

Least Recently Used cache Java Implementation

Solution code: LRU Cache Python Implementation

Least Recently Used cache python Implementation
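Following the steps above, a minimal Python sketch of the hash map + doubly linked list design looks like this. It implements LeetCode's LRUCache interface; the sentinel head/tail nodes are an implementation convenience (they remove the empty-list edge cases), not a requirement of the problem.

```python
class Node:
    """Doubly-linked-list node holding one cache entry."""
    def __init__(self, key=0, value=0):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None

class LRUCache:
    """LRU cache: a hash map for O(1) lookup plus a doubly linked list
    for O(1) reordering. head.next is the most recently used node,
    tail.prev the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.map = {}               # key -> Node
        self.head = Node()          # sentinel before the most recent node
        self.tail = Node()          # sentinel after the least recent node
        self.head.next = self.tail
        self.tail.prev = self.head

    def _remove(self, node):
        node.prev.next = node.next
        node.next.prev = node.prev

    def _add_to_front(self, node):
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        if key not in self.map:
            return -1               # cache miss
        node = self.map[key]
        self._remove(node)          # move to front: most recently used
        self._add_to_front(node)
        return node.value

    def put(self, key, value):
        if key in self.map:         # existing key: update and refresh order
            node = self.map[key]
            node.value = value
            self._remove(node)
            self._add_to_front(node)
            return
        if len(self.map) == self.capacity:
            lru = self.tail.prev    # least recently used node
            self._remove(lru)
            del self.map[lru.key]
        node = Node(key, value)
        self.map[key] = node
        self._add_to_front(node)
```

Every operation touches only a constant number of nodes and one hash map entry, so get and put both run in O(1) time, with O(n) space for the map and the list together.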

Following are some of the pros and cons of LRU Cache.

Pros:

  1. Fast access: the cache stores items in order from most recently to least recently used. This allows us to access both the most recently used and the least recently used element in O(1) time.
  2. Fast updates: we can find any element in the cache in O(1) time, so updating the cache also takes O(1) time.

Cons:

  1. Space complexity: we need to store the elements in two data structures, a linked list and a hash map, each taking O(n) space.

Similar Coding Questions to Practice

Enjoy learning! Enjoy coding!
