How to Build a Recommender System Using Machine Learning?

With increased access to the internet, every business sector has expanded rapidly. We all know Amazon, Flipkart, YouTube, Instagram, and many similar platforms. But have we ever thought about a key reason behind their success? Much of it comes from offering a better customer experience through personalization. So how do they achieve personalization? They provide well-personalized content using a kind of system called a Recommendation System.

Today almost every business focuses on its customers by providing customer-oriented services and suggesting content that closely matches their behavior and actions. Platforms like Netflix, Spotify, Amazon, and YouTube try to increase the time users spend on them in order to earn more, and Recommendation Systems play a central role in this. They have transformed the way we interact with products and help customers make better choices. So let’s dive into the concepts and working principles of recommendation systems.

Key features of this blog

This blog presents an end-to-end description of Recommender Systems. It answers the following questions in detail:

  1. What is a Recommender System, and why is it used?
  2. What are the different types of Recommender Systems, and what are their pros and cons?
  3. How to implement a content-based recommender system?
  4. How are companies using Recommender Systems?

What is a Recommender System?

A recommender system is a system that predicts a user's future preferences based on his/her previous behavior, or on the behavior of similar users, and recommends the top items from a large pool of available options. In a nutshell, recommender systems are like salespeople who know us very well and suggest products based on our history, preferences, and what we like the most.

In the past, people usually relied on local stores and had to choose from minimal options. There used to be only a few stores, and they could only afford limited varieties, but now people have a vast number of options over the internet.

How are Recommender Systems revolutionizing the world?

They aim to understand customer choices better and keep improving themselves based on user interactions. They have transformed the world through user-oriented services. Companies like Amazon and Netflix widely use recommender systems to increase sales and customer retention by recommending products based on customers' interests and choices.

Recommender systems have transformed the traditional way of figuring out what customers want to buy next. They boost the sales of many similar products and create a better customer experience. As shown in Figure 1, if someone is looking to buy a Bluetooth neckband, a recommender system will show them other neckbands and a list of related products with some lucrative offers.

recommendation system img 1

Traditionally, people had to look at each product individually and go through lists of individual items. If a customer searched for a neckband, they would be shown a list of neckbands from which to pick one. With recommender systems in place, selecting a neckband is easier: customers are automatically shown a list of similar neckbands and a comparison between them, as shown in Figure 2. This leads to a better-informed choice for the customer, and as a result, their trust in and satisfaction with the seller and the platform increase.

recommendation system img 2

Types of Recommender Systems

There are mainly six types of recommender systems based on their working principle. The most commonly used ones are described below:

Content-Based Recommendation

A content-based recommender focuses on a user's historical behavior and works with data that the user provides to build a user profile, which is then used to make recommendations. The system keeps adapting as users provide more ratings or inputs.

content based recommendation


If someone searches for a Bluetooth neckband, they will receive recommendations based on their recent shopping trends. Figure 3 shows how this content-based recommendation system works.

Pros: Content-based systems do not suffer from the cold-start problem for new items or from popularity bias, and they can recommend items with rare features.
Cons: Content-based systems often perform worse than other recommender systems and suffer from the over-specialization problem (they keep recommending items similar to what the user has already seen).

recommendation system img 4

Collaborative Filtering

A collaborative filtering recommender focuses on the similarity in the behavior of users. It assumes that people who have behaved similarly or agreed about certain things in the past will agree in the future. The system finds similar users or items based on their rating history and groups them into neighbourhoods.

collaborative filtering in machine learning

The collaborative filtering approach does not rely on machine-analyzable content, and hence it can accurately recommend complex items. Figure 4 shows how Collaborative Filtering works.

Pros: No feature selection is required; hence it works for any kind of item.
Cons: Collaborative filtering suffers from the cold-start, sparsity, and popularity-bias problems.

recommendation system img 6
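
To make the neighbourhood idea concrete, here is a minimal sketch of user-based collaborative filtering. The ratings, user names, and the choice of cosine similarity over co-rated items are all assumptions made for illustration; real systems use far larger, sparser data and more careful normalization.

```python
from math import sqrt

# Hypothetical explicit ratings (1-5); a missing key means "not rated".
ratings = {
    "alice": {"Inception": 5, "Interstellar": 4, "Titanic": 1},
    "bob":   {"Inception": 4, "Interstellar": 5, "Avatar": 4},
    "carol": {"Titanic": 5, "Avatar": 2, "Inception": 1},
}

def user_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(ratings[a]) & set(ratings[b])
    if not common:
        return 0.0
    dot = sum(ratings[a][i] * ratings[b][i] for i in common)
    norm_a = sqrt(sum(ratings[a][i] ** 2 for i in common))
    norm_b = sqrt(sum(ratings[b][i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict(user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        sim = user_similarity(user, other)
        num += sim * ratings[other][item]
        den += sim
    # None signals that no neighbour has rated the item (cold start).
    return num / den if den else None

print(predict("alice", "Avatar"))
```

Here alice's rating history agrees with bob's far more than with carol's, so the predicted rating for "Avatar" lands closer to bob's 4 than to carol's 2. An item nobody has rated yields `None`, which is the cold-start problem in miniature.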

Hybrid Recommender System

Both content-based and collaborative filtering-based recommender systems have pros and cons. So it is often better to leverage both content and collaborative data in a Hybrid system for recommendations. Netflix is a well-known example of a hybrid recommender system. When a new user subscribes to the service, they are asked to rate content they have already seen. Once the user begins using the service, collaborative filtering suggests similar content to them.
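
One simple way to combine the two signals is a weighted blend. The sketch below is only an illustration under stated assumptions: both scores are taken to lie in [0, 1], and the `full_trust_at` threshold is a hypothetical knob, not something any particular platform is known to use.

```python
def hybrid_score(content_score, collab_score, n_user_ratings, full_trust_at=20):
    """Blend two scores in [0, 1].

    The collaborative score gets more weight as the user accumulates
    ratings; `full_trust_at` (hypothetical) is the number of ratings at
    which the blend relies on collaborative filtering alone.
    """
    w_collab = min(n_user_ratings / full_trust_at, 1.0)
    return (1 - w_collab) * content_score + w_collab * collab_score

print(hybrid_score(0.8, 0.2, n_user_ratings=0))   # brand-new user: content only
print(hybrid_score(0.8, 0.2, n_user_ratings=10))  # halfway: equal blend
print(hybrid_score(0.8, 0.2, n_user_ratings=20))  # established user: collaborative only
```

The design choice here is to lean on content features while the user is new (where collaborative filtering has a cold-start problem) and shift towards collaborative data as more ratings arrive.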

Recommender Systems are continuously improving the customer experience, and hence more and more companies are adopting this technology to expand and generate more revenue.
In the next section, we’ll walk through how to implement our own content-based recommender system. So, without any further delay, let’s implement.

How to implement a Content-Based Recommender System?

Let’s first look at the high-level view of implementing any simple Recommender system, and then we’ll dive a little deeper and look into the actual dataset. So let’s get started :).

Let’s build a recommender system that recommends items to users based on their previous ratings. We’ll proceed as follows:

  1. We start with the user and find the set of items that they liked or purchased. For each of these items, we build an item profile, which is a description of the item.
  2. From these item profiles, we infer a user profile, which is a description of the user.
  3. Once we have a user profile, we match it against the pool of item profiles using a similarity measure and recommend the most similar items to the user.

How to build an Item profile?

For each item, we create an item profile. An item profile consists of a set of important features. For example:

  1. Movies: Author, Title, Actor, etc.
  2. Images: Tags, Metadata, etc.
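
As a rough sketch, an item profile can be stored as a 0/1 vector with one entry per known feature. The movie titles and feature tokens below are placeholders invented for illustration, and the 0/1 encoding is just one simple option.

```python
# Hypothetical movie metadata: each item is described by a set of feature tokens.
movies = {
    "Movie A": {"genre:action", "genre:sci-fi", "actor:actor_1"},
    "Movie B": {"genre:action", "actor:actor_2"},
    "Movie C": {"genre:romance", "actor:actor_1"},
}

# A fixed, sorted vocabulary so every profile uses the same feature order.
vocabulary = sorted(set().union(*movies.values()))

def item_profile(features):
    """Return a 0/1 vector with one entry per vocabulary feature."""
    return [1 if f in features else 0 for f in vocabulary]

profiles = {title: item_profile(feats) for title, feats in movies.items()}

print(vocabulary)
for title, profile in profiles.items():
    print(title, profile)
```

Because every profile shares the same vocabulary order, the vectors can be compared directly with a similarity measure later on.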

How to build a User profile?

Suppose a user has rated items with profiles i1, i2, i3, ..., in. The user profile is then simply the plain or weighted average of the rated item profiles. Various other aggregation methods are possible.
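
A minimal sketch of this averaging is shown below. Using the user's ratings as weights is an assumption for illustration; other weighting schemes are equally valid.

```python
def user_profile(item_profiles, ratings=None):
    """Average the rated item profiles.

    With `ratings=None` this is a plain average; otherwise each item
    profile is weighted by the rating the user gave that item.
    """
    if ratings is None:
        ratings = [1.0] * len(item_profiles)
    total = sum(ratings)
    dims = len(item_profiles[0])
    return [
        sum(w * p[d] for w, p in zip(ratings, item_profiles)) / total
        for d in range(dims)
    ]

# Two hypothetical item profiles over three features.
rated = [[1, 0, 1], [0, 1, 1]]

print(user_profile(rated))          # plain average
print(user_profile(rated, [5, 1]))  # first item rated 5, second rated 1
```

With the weighted version, the user profile leans towards the features of the item the user rated higher.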

Once we have the User and Item profile ready, we can use them to make predictions using Cosine Similarity.

Suppose we have a user profile x and an item profile i. We estimate the utility U(x, i), given by the cosine of the angle θ between the two vectors, and recommend the items with the highest values:

 U(x, i) = cos(θ) = (x · i) / (|x| |i|)

The greater the value of cos(θ), the smaller the angle θ, and hence the closer x and i are.
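
The sketch below computes this cosine score between a user profile x and a few item profiles, then ranks the items by it. The vectors and movie names are made up for illustration.

```python
from math import sqrt

def cosine(x, i):
    """Cosine of the angle between vectors x and i (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(x, i))
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in i))
    return dot / norm if norm else 0.0

# Hypothetical user profile and item profiles over the same three features.
user = [0.8, 0.2, 1.0]
items = {
    "Movie A": [1, 0, 1],
    "Movie B": [0, 1, 0],
    "Movie C": [1, 1, 1],
}

# Rank items from most to least similar to the user profile.
ranked = sorted(items, key=lambda title: cosine(user, items[title]), reverse=True)
for title in ranked:
    print(title, round(cosine(user, items[title]), 3))
```

Note that cosine similarity depends only on the direction of the vectors, not their length, so an item with many features is not favoured simply for having more non-zero entries.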

In this manner, we recommend items to users using Content-Based Recommender Systems. The figure below gives a complete overview of how a content-based recommender system works.

recommendation system img 7

Complete Overview 

Implementation

By now, we have a basic understanding of how a recommender system works. In this section, we’ll look at an actual dataset and see how the recommender system works on a real-world example of recommending movies. We use the TMDB 5000 movie dataset to build a Content-Based Recommender System.

Step 1: Dataset Description

In this system, we use the movies' contents, such as title, genre, cast, and director, as the features to recommend similar movies. The dataset looks as shown below.

recommendation system img 8

Step 2: Text Pre-processing

In this step, we pre-process the data, i.e., convert text to lowercase, remove punctuation, tokenize, stem, etc. We can use the Porter stemmer algorithm for stemming. After cleaning the data, our dataset looks like this:

recommendation system img 9
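
The cleaning steps can be sketched with the standard library alone. Note the loud assumption: `simple_stem` below is a crude suffix stripper used as a stand-in so that the example has no dependencies. It is not the Porter algorithm, and in practice a library implementation of Porter stemming would be used instead.

```python
import re

def simple_stem(word):
    """Crude suffix stripping; a stand-in for, NOT an implementation of, Porter."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, replace punctuation with spaces, tokenize, then stem."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    tokens = text.split()
    return [simple_stem(token) for token in tokens]

print(preprocess("The Agents Hunted Robots, Fighting in Cities!"))
# ['the', 'agent', 'hunt', 'robot', 'fight', 'in', 'citi']
```

Stems such as "citi" are not dictionary words, and that is fine: the goal is only to map related word forms to the same token before vectorizing.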

Step 3: Generate Recommendations using TF-IDF and Cosine Similarity

After all the text processing, it’s time to generate word vectors for each feature. For this, we can use TF-IDF (Term Frequency-Inverse Document Frequency) to generate a matrix in which each column represents a word in the overview vocabulary and each row represents a movie. This matrix is used for calculating the similarity scores.

Here, Term Frequency is the relative frequency of a given word in a document, and Inverse Document Frequency is the logarithm of the total number of documents divided by the number of documents containing that word, so words that appear in many documents get a lower weight.

TF = (occurrences of the given word in the document) / (total words in the document)

IDF = log((number of documents) / (number of documents containing the given word))

The overall importance of each word to a document in which it appears is equal to TF * IDF.
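
Here is a small sketch that applies the TF and IDF definitions above directly, using only the standard library. The three tokenized "documents" are invented for illustration. Library implementations may differ in details such as smoothing and normalization, so their numbers will not necessarily match this plain version.

```python
import math
from collections import Counter

def tf_idf(documents):
    """documents: list of token lists. Returns one {word: weight} dict per document."""
    n_docs = len(documents)
    # Document frequency: the number of documents each word appears in.
    df = Counter(word for doc in documents for word in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return weights

# Hypothetical pre-processed movie overviews.
docs = [
    ["space", "alien", "war"],
    ["space", "love", "story"],
    ["space", "war", "hero"],
]

weights = tf_idf(docs)
for w in weights:
    print(w)
```

The word "space" appears in every document, so its IDF is log(1) = 0 and it contributes nothing, while "alien", which appears in only one document, receives the highest weight in the first document.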

*Note: We can also use CountVectorizer() instead of TF-IDF, since TF-IDF down-weights an actor who has acted in many movies.*

Once we have this importance matrix, we can use it to generate similarity measures. Here we are using Cosine Similarity for calculating the measures.
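
The following sketch shows how pairwise cosine similarity over such weights can drive a "similar movies" lookup. The movie titles and weights are hypothetical, and each vector is stored sparsely as a {word: weight} dictionary; this is an illustration, not the exact code behind the results shown in this blog.

```python
from math import sqrt

# Hypothetical TF-IDF weights per movie, stored sparsely as {word: weight}.
vectors = {
    "Space War":   {"alien": 0.5, "war": 0.4, "hero": 0.1},
    "Galaxy Hero": {"alien": 0.3, "hero": 0.6},
    "Love Story":  {"love": 0.7, "paris": 0.5},
    "War Diary":   {"war": 0.8, "love": 0.1},
}

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(title, top_n=2):
    """Return the `top_n` movies most similar to `title`, best first."""
    scores = [
        (other, cosine(vectors[title], vec))
        for other, vec in vectors.items()
        if other != title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scores[:top_n]]

print(recommend("Space War"))  # ['War Diary', 'Galaxy Hero']
```

"Love Story" shares no words with "Space War", so its similarity is zero and it never makes the list.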

recommendation system img 10

After calculating the similarity measures, we can easily recommend movies. Here are examples of how our recommender system recommends movies to the user.

recommendation system img 11

recommendation system img 12

Performance Evaluation

There are two ways to evaluate a recommender system: offline and online.

Offline Way: The offline way measures the system's performance by splitting the data into training and validation sets. Offline evaluators are of two types: implicit and explicit. Some metrics for offline evaluation are:

  • RMSE
  • NDCG@k
  • F1 Score
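
As a rough illustration, RMSE and NDCG@k can be computed as follows. The sample numbers are invented, and the NDCG helper makes a simplifying assumption: the ideal ordering is derived from the same list of relevance scores that was passed in.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between actual and predicted ratings."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items (rank 0 is the top)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """relevances: graded relevance of the recommended items, in ranked order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal else 0.0

print(rmse([4, 3, 5], [3, 3, 4]))  # rating-prediction error
print(ndcg_at_k([3, 2, 1], k=3))   # best items ranked first
print(ndcg_at_k([1, 2, 3], k=3))   # best items ranked last
```

RMSE suits explicit rating prediction, where lower is better. NDCG@k rewards placing the most relevant items near the top of the list, where 1.0 means the ranking is ideal.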

Online Way: The online way tracks the recommendations after the model is deployed and validates them through customer interaction. Metrics for online evaluation are:

  • Customer Lifetime Value 
  • Click-Through Rate
  • Return On Investment 

Many other approaches are used to build Recommender Systems. Nowadays, deep learning models such as autoencoders are also used to develop recommender systems. Moreover, reinforcement learning-based recommender systems are deployed by major tech companies to achieve state-of-the-art performance.

How are companies using Recommender Systems?

Nowadays, almost every company focuses on better user personalization and longer retention time. Recommender Systems offer a better consumer experience and also boost companies' revenue. Let’s see how some companies rely on them:

YouTube

We all must have used YouTube, but have we ever noticed that our home feed begins to show similar content when we search for or watch a specific kind of video? If we haven’t noticed it yet, it is worth checking: this is how YouTube tries to increase its users’ retention time. YouTube heavily personalizes recommendations based on a user’s viewing history and hence offers a better user experience.

Social Media (Instagram, TikTok)

If we are familiar with Instagram (including its Reels feature) or TikTok, we must have noticed that the search section shows similar content when we like or spend some time on a certain kind of content. By tracking users’ interactions, these platforms try to predict their behavior, which can be very useful for advertising third-party products.

Amazon

We must have interacted with Amazon, right! Whenever we buy or search for a product, we must have noticed sections like “People who bought this also bought….” or other sections offering lucrative deals on add-on items for the searched product. All these are possible because of Recommender Systems. Such systems have proved to be a boon to both consumers and the company. Consumers now have more options to choose from, and companies generate more revenue by offering more and more tempting deals :)

Netflix

Netflix and Chill! This is common lingo nowadays :) Recently, Netflix announced two days of free access to its content. We must have tried that, right! But have we ever thought about the intuition behind it? They are trying to figure out how customers they have not yet reached search through the large content pool and what kind of content is most in demand. By observing such behavior, they can create new content and recommendations. Interesting, isn’t it? So, in a nutshell, Netflix heavily relies on Recommendation systems for creating new content and increasing its revenue. We can gauge the importance of the recommender system to Netflix from the prize money it offered to a team that could beat Netflix’s own recommender system by 10%. So let’s dive into the details of “The Netflix Prize.”

Case Study: The Netflix Prize

“The Netflix Prize” was a competition launched by Netflix in October 2006. The challenge motivated researchers to develop a novel recommender system that could beat Netflix’s existing system by 10% or more. Netflix offered a grand prize of US $1,000,000 to the team that could surpass the accuracy of its existing recommender system by 10%. For this competition, a dataset of about 100 million movie ratings was offered to the teams to work on. On 21 September 2009, BellKor’s Pragmatic Chaos team was awarded the grand prize of US $1,000,000.

The Netflix Prize gave a boost to research on recommender systems, and as a result, many companies came up with their own recommender systems to focus more on their customers.

Possible Interview Questions

  1. What are the steps that you took to pre-process your data?
  2. Why is a hybrid approach more beneficial in a recommendation system?
  3. How will you evaluate your recommendation system?
  4. What is the Porter stemmer algorithm?
  5. Why is cosine similarity a good measure of the similarity between two items?

Conclusion

Recommender Systems play a very significant role in today’s industries. Using them, companies are focusing more and more on users’ behavior for better personalization and longer retention time. Recommender Systems keep evolving based on user interactions and behavior, and they are a backbone of the E-Commerce industry.
This article covered how Recommender Systems work, their use cases, and the key role they play in various real-world scenarios to deliver a better personalized experience. We hope you liked it. Please share your views on this blog in the comments.

If you have any ideas/queries/doubts/feedback, please comment below or write to us at contact@enjoyalgorithms.com. Enjoy learning, Enjoy coding, Enjoy algorithms!
