Implement Least Recently Used (LRU) Cache

Difficulty: Hard, Asked-in: Amazon, Microsoft, Adobe, Google

Key takeaway: an excellent algorithm to learn data structure design and problem-solving using hash tables and doubly-linked lists.

Understanding LRU cache problem

Least Recently Used (LRU) is one of the most popular caching strategies. It defines a policy for evicting elements: when the cache is full, the least recently used item is discarded first to make room for a new element. The cache keeps items in order of their use, which allows us to identify items that have not been used for a long time.

So our goal is to design a data structure that follows the constraints of a Least Recently Used (LRU) cache. We need to implement the LRUCache class with the following operations:

  • LRUCache(int capacity): Initialize the LRU cache with a positive size capacity.
  • int get(int key): Return the value of the key if the key exists; otherwise, return -1.
  • void put(int key, int value): Update the value of the key if the key exists. Otherwise, add the key-value pair to the cache. If the number of keys exceeds the capacity of the LRU cache, evict the least recently used key.
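
For example, here is a short trace showing how a cache of capacity 2 behaves (using the LRUCache class implemented later in this article):

LRUCache cache = new LRUCache(2);
cache.put(1, 10);    // cache: {1=10}
cache.put(2, 20);    // cache: {2=20, 1=10}
cache.get(1);        // returns 10; key 1 becomes the most recently used
cache.put(3, 30);    // capacity exceeded: evicts key 2, the least recently used
cache.get(2);        // returns -1 (key 2 was evicted)
cache.get(3);        // returns 30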

We want the following specifications from our LRU cache:

  • Access time for any item should be O(1).
  • Time required to get the least recently used element should be O(1).
  • The space required should be O(n).

Let's understand the problem via an example

Suppose we have five elements in the main memory, A1 to A5, and let the size of our cache be 3.

LRU cache example: step 1

Initially, the cache is empty, and all the elements are stored in memory. We want to get A1 first. We get the value of A1 from memory and store it in the cache.

LRU cache example: step 2

Next, we want to get A2. A2 gets stored at the topmost level, and A1 is moved down as it is no longer the most recently used element.

LRU cache example: step 3

Next, we want to get A3. We fetch it from memory and store it at the top of the cache. The cache is now full.

LRU cache example: step 4

Now suppose we want to get A2 again. Instead of getting this from memory, we can get this from our cache. Notice that the position of A2 is at the top again, as A2 is the most recently used element now.

LRU cache example: step 5

Now we want to get A4. We have to fetch it from memory, but where will we store it in our cache? The cache is full, so we must remove an element to make room for A4. We remove the least recently used element, which is A1 in this case.

LRU cache example: step 6

Since our cache can store only three elements, the least recently used element must be discarded whenever a new element arrives and the cache is full.

Before designing the implementation of the LRU cache, let's look at the need for a cache. Generally, retrieving data from a computer's main memory is an expensive task. A high-speed memory known as cache memory is used to avoid accessing data from main memory repeatedly. A cache holds frequently requested data and instructions so that they are immediately available to the CPU. Thus, cache memory reduces the average time for accessing data from the main memory.

The cache memory size is generally much smaller than the main memory. So we cannot fit everything from the main memory into the cache. There are different ways of handling this; LRU cache is one such way. The main idea of the LRU cache is to store the n recently accessed elements (assume that the size of the cache is n).

A brute force approach: LRU cache using a simple array

We initialize an array of size equal to that of our cache. Each data element stores extra information: an access time stamp that records when the element was last used. We will use the timeStamp to find the least recently used element in the cache.

class DataElement
{
    int key;
    int value;
    int timeStamp;  // 0 means most recently used; larger values are older

    public DataElement(int k, int data, int time)
    {
        key = k;
        value = data;
        timeStamp = time;
    }
}

int get(int key): We traverse the array and compare the key of each data element with the given key. If we find an element whose key matches, we set the time stamp of that element to 0 and return its value. If we don't find any such element, we return -1. Time complexity = O(n)

void set(int key, int value): When the array is not full, we increase the time stamps of the existing elements by 1, set the time stamp of the new element to 0, and insert it into the array. When the array is full, we must delete the least recently used element, i.e., the element with the largest time stamp. For this, we iterate through the array, find the element with the largest timeStamp, and insert the new element (with the new key and value) in its place. Time complexity = O(n)
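
Here is a minimal sketch of this approach in Java, built on the DataElement class above. The class name ArrayLRUCache is a placeholder, and for brevity, set assumes the given key is not already present in the array:

class ArrayLRUCache
{
    private DataElement[] elements;
    private int size = 0;

    public ArrayLRUCache(int capacity)
    {
        elements = new DataElement[capacity];
    }

    public int get(int key)
    {
        // O(n): scan the array for the given key
        for (int i = 0; i < size; i++)
        {
            if (elements[i].key == key)
            {
                elements[i].timeStamp = 0;  // mark as most recently used
                return elements[i].value;
            }
        }
        return -1;
    }

    public void set(int key, int value)
    {
        // age all existing elements by one step
        for (int i = 0; i < size; i++)
            elements[i].timeStamp++;

        if (size < elements.length)
        {
            // array not full: insert the new element with time stamp 0
            elements[size++] = new DataElement(key, value, 0);
        }
        else
        {
            // array full: replace the element with the largest time stamp (LRU)
            int lru = 0;
            for (int i = 1; i < size; i++)
            {
                if (elements[i].timeStamp > elements[lru].timeStamp)
                    lru = i;
            }
            elements[lru] = new DataElement(key, value, 0);
        }
    }
}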

A brute force approach: LRU cache using a singly linked list

int get(int key): We traverse the linked list and find the data element whose key is equal to the given key. If it is found, we move that node to the head of the linked list and return its value. Otherwise, we return -1. Time complexity: O(n)

void set(int key, int value): If the key already exists, we update its value and move its node to the head of the linked list. Otherwise, if the length of the linked list is less than the size of our cache, we simply insert the new data element at the head of the linked list. If the cache is full, we also need to delete the least recently used element, which is present at the tail of the linked list, before inserting the new element at the head. Time complexity: O(n)
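
A similar sketch for this approach (ListLRUCache and ListNode are placeholder names; as above, set assumes a new key for brevity):

class ListLRUCache
{
    private static class ListNode
    {
        int key, value;
        ListNode next;
        ListNode(int k, int v) { key = k; value = v; }
    }

    private ListNode head;  // most recently used element
    private int size = 0;
    private final int capacity;

    public ListLRUCache(int capacity)
    {
        this.capacity = capacity;
    }

    public int get(int key)
    {
        // O(n): search for the key, then move its node to the head
        ListNode prev = null, curr = head;
        while (curr != null && curr.key != key)
        {
            prev = curr;
            curr = curr.next;
        }
        if (curr == null)
            return -1;
        if (prev != null)  // not already at the head
        {
            prev.next = curr.next;  // unlink the node
            curr.next = head;       // move it to the head
            head = curr;
        }
        return curr.value;
    }

    public void set(int key, int value)
    {
        if (size == capacity)
        {
            // O(n): walk to the second-to-last node and drop the tail (LRU)
            ListNode curr = head;
            while (curr.next != null && curr.next.next != null)
                curr = curr.next;
            if (curr.next != null)
                curr.next = null;
            else
                head = null;  // capacity 1: the head itself is the LRU
            size--;
        }
        ListNode node = new ListNode(key, value);
        node.next = head;
        head = node;
        size++;
    }
}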

The critical observation: the time complexity of both get and set operations in both approaches is O(n), which is inefficient and does not satisfy the O(1) specification given in the problem. As an exercise, try completing the full implementation code for both approaches!

An efficient approach using a hash map and doubly linked list

Solution idea

When we look at our requirements, we can conclude that we need a way to check in constant time whether an element is present in the cache. But we also need to store elements in a specific order: the least recently used item should be at the bottom, and we should be able to move any item to the top in constant time. A doubly linked list gives us exactly this ordering with O(1) insertion, removal, and movement of nodes. However, using the doubly linked list alone, we would need O(n) time to find any element. Thus, we also need a hash map that maps keys to linked list nodes, which allows us to locate an element in the linked list in O(1) time.

When inserting new data, we insert it at the head of the linked list and record it in the map. After each cache hit, we move the accessed node to the head of the linked list. When the cache is full, we discard the tail of the linked list and delete the corresponding key from the map.

How the LRU cache works

Solution steps

Following are the steps we need to take whenever we access any element:

  1. Look up the element in our hash map.
  2. If the element is present in the hash map, it is present in our cache, and we do not need to fetch it from memory. This is known as a cache hit.
  3. Find the corresponding linked list node for this element from the hash map.
  4. Move this node to the head of the linked list, as this element is now the most recently used.
  5. If the element is not present in the hash map, it is not in our cache. That means we need to get its value from memory and load it into our cache. This is known as a cache miss.
  6. If the cache is full, we first need to remove the least recently used element, which is the last node of the linked list.
  7. Discard this element by removing the last node from the linked list and removing its key from the hash map.
  8. Now create a new node for the requested element and insert it at the head of the linked list, i.e., the top of our cache.
  9. Also, store this node in the hash map.

If we follow all these steps, we can update our cache in O(1) time, as each of these steps takes O(1) time. Following is the implementation code of the LRU cache.

LRU Cache Java Implementation

import java.util.HashMap;
public class LRUCache {
    
    class Node {
        int key;
        int value;
        Node pre;
        Node post;
    }

    
    private HashMap<Integer, Node> cache = new HashMap<Integer, Node>();
    private int count;
    private int capacity;
    private Node head, tail;
    
    //Add the new node right after head
    private void addNode(Node node) {
        node.pre = head;
        node.post = head.post;
        
        head.post.pre = node;
        head.post = node;
    }

    //Remove an existing node from the linked list
    private void removeNode(Node node){
        Node pre = node.pre;
        Node post = node.post;
        
        pre.post = post;
        post.pre = pre;
    }

    //Move a node from anywhere in the list to the head
    private void moveToHead(Node node){
        this.removeNode(node);
        this.addNode(node);
    }

    //Pop the current tail
    private Node popTail(){
        Node res = tail.pre;
        this.removeNode(res);
        return res;
    }


    public LRUCache(int capacity) {
        this.count = 0;
        this.capacity = capacity;

        head = new Node();
        head.pre = null;

        tail = new Node();
        tail.post = null;

        head.post = tail;
        tail.pre = head;
    }

    public int get(int key) {
        Node node = cache.get(key);

        if(node == null){
            return -1; 
        }

        this.moveToHead(node);
        return node.value;
    }


    public void put(int key, int value) {
        Node node = cache.get(key);

        if(node == null){
            Node newNode = new Node();
            newNode.key = key;
            newNode.value = value;

            this.cache.put(key, newNode);
            this.addNode(newNode);

            ++count;

            if(count > capacity){
                // evict the least recently used node from both list and map
                Node lruNode = this.popTail();
                this.cache.remove(lruNode.key);
                --count;
            }
            
        }
        else {
            // update the value.
            node.value = value;
            this.moveToHead(node);
        }
    }
}
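
Notice the two dummy sentinel nodes, head and tail: real nodes always live between them, so every node in the list has both a pre and a post neighbor. This allows addNode and removeNode to update pointers without any null checks or special cases for the first and last elements. The Python implementation below uses the same trick.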

LRU Cache Python Implementation

#doubly linked list node
class Node:
    def __init__(self, k, v):
        self.key = k
        self.val = v
        self.prev = None
        self.next = None

        
class LRUCache:
    def __init__(self, capacity):
        #initialize LRU cache with size = capacity
        self.capacity = capacity
        self.hashMap = dict()
        
        #head and tail of the linkedlist
        self.head = Node("#", 0)
        self.tail = Node("-", 0)
        self.head.next = self.tail
        self.tail.prev = self.head

    def get(self, key):
        #returns the value of the key, if it exists in cache
        
        if key in self.hashMap:
            node = self.hashMap[key]
            #remove this node from doubly linked list
            self._remove(node)
            #move it to the front of the list (most recently used)
            self._add(node)
            return node.val
        
        return -1

    def put(self, key, value):
        # update value of the key if it exists
        #this key becomes the most recently used one
        
        if key in self.hashMap:
            #if already present, cache hit
            self._remove(self.hashMap[key])
            
        
        newNode = Node(key, value)
        #add new node at the start
        self._add(newNode)
        
        self.hashMap[key] = newNode
        
        #if the number of keys exceeds the capacity, evict the least recently used key
        if len(self.hashMap) > self.capacity:
            #remove least recently used node
            nodeToRemove = self.tail.prev
            self._remove(nodeToRemove)
            del self.hashMap[nodeToRemove.key]

    def _remove(self, node):
        previousNode = node.prev
        nextNode = node.next
        previousNode.next = nextNode
        nextNode.prev = previousNode

    def _add(self, node):
        nextNode = self.head.next
        previousNode = self.head
        
        previousNode.next = node
        nextNode.prev = node
        
        node.next = nextNode
        node.prev = previousNode

Following are some of the pros and cons of LRU Cache.

Pros:

  1. Fast access: Our cache stores items in order from most recently used to least recently used. This allows us to access both the most recently used and the least recently used element in O(1) time.
  2. Fast updates: We can find any element in the cache in O(1) time using the hash map, so updating the cache takes O(1) time.

Cons:

  1. Space complexity: We need to store the elements in two data structures, a linked list and a hash map, each taking O(n) space.

Enjoy learning! Enjoy coding!
