Design Facebook Newsfeed

With the rise of smartphones and wider access to the internet, social media platforms such as Facebook, WhatsApp, and YouTube have grown enormously and have become a daily necessity for today’s generation. It is found that an average US adult spends around 38 minutes per day on Facebook. In 2019, internet users spent an average of 142 minutes per day on social media. This growing engagement has encouraged many businesses to offer personalized services to consumers in order to increase their profits.

In this blog, we’ll walk through the design of an essential Facebook component: its News Feed service. The News Feed contains the photos, comments, posts, etc., shared by the people and pages that the user follows. So, let’s look at what the News Feed service is and get started :)

What is Facebook News Feed?

Whenever you open Facebook, the News Feed is the first service that you encounter. It is a continually updating list of posts, photos, videos, comments, and activities from the people and pages you follow. Most social media platforms, such as LinkedIn, Twitter, YouTube, and Instagram, have a News Feed service that acts as the main landing page and displays real-time updates. Many companies also advertise their products on social media platforms and use Machine Learning to show customized and relevant advertisements to users in order to increase their profits.

Over the past few years, social media platforms such as Facebook have scaled tremendously. News Feed is a vital component of Facebook, as it directly affects user engagement and interaction. Hence, Facebook teams pay close attention to every use case and aim to provide real-time services. Without further ado, let’s look into the key requirements of the Facebook News Feed service.

Key Requirements

Facebook is one of the most widely used social networking platforms; hence, it must offer real-time services and remain reliable and available. The following are the key features that must be taken care of while designing such a large-scale service:

  1. Users can have many friends and followers.
  2. The News Feed consists of photos, videos, and updates from the people and pages that a user follows.
  3. The service should be able to append new posts to the feed of every active user as they arrive.
  4. The system should be able to generate the user’s news feed in real-time.

Once the service’s key requirements are identified, the next step is to estimate the system’s capacity. Let’s look into the Capacity Estimation necessary for our News Feed Service.

Capacity Estimation

Let's assume that we have 500M total active users and each user uses Facebook five times a day.

  • Average News Feed requests per day = 500M * 5 = 2.5B requests per day
  • Let's suppose we keep 1000 posts per user to appear in their feed and the average size of each post is 10KB. Total storage = 500M * 1000 * 10KB = 5000TB
  • Let's assume each server stores 100GB of data; then the total servers required = 5000TB/100GB = 50,000 servers
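
The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the constants are the assumptions stated in this section, and decimal units (1TB = 1000GB) are assumed so that the round numbers match.

```python
# Back-of-the-envelope capacity estimate using the figures assumed above.
ACTIVE_USERS = 500_000_000      # 500M active users
VISITS_PER_DAY = 5              # feed requests per user per day
POSTS_PER_FEED = 1000           # posts kept per user's feed
POST_SIZE_KB = 10               # average size of one post
SERVER_CAPACITY_GB = 100        # data stored on one server

# Decimal units are an assumption made to match the round numbers in the text.
KB_PER_GB = 1000 ** 2
KB_PER_TB = 1000 ** 3

requests_per_day = ACTIVE_USERS * VISITS_PER_DAY
avg_requests_per_second = requests_per_day / (24 * 60 * 60)

total_storage_kb = ACTIVE_USERS * POSTS_PER_FEED * POST_SIZE_KB
total_storage_tb = total_storage_kb // KB_PER_TB
servers_needed = total_storage_kb // (SERVER_CAPACITY_GB * KB_PER_GB)

print(f"Requests per day:     {requests_per_day:,}")
print(f"Average requests/sec: {avg_requests_per_second:,.0f}")
print(f"Total storage (TB):   {total_storage_tb:,}")
print(f"Servers needed:       {servers_needed:,}")
```

Note that 2.5B requests per day is an average; peak traffic is usually several times higher, so a real estimate would add headroom on top of these numbers.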

High-Level Design

Facebook has a vast user base, so the system can receive many requests simultaneously. It should therefore respond with minimum latency and provide real-time services. The feed is fetched through a well-defined API, which can be exposed using either REST or SOAP.

getfeed(api_key, user_id, since, count)
Parameters:
1) api_key  :  Developer key of the API client
2) user_id  :  ID of the user
3) since    :  Return only feed items newer than this point (for example, a timestamp)
4) count    :  Number of feed items to return
Return      :  Returns a JSON object containing the list of feed items
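
To make the call shape concrete, here is a minimal sketch of such an endpoint backed by a hypothetical in-memory store. The store `_FEEDS`, the JSON field names, and the treatment of `since` as a timestamp lower bound are all assumptions of this sketch, not part of any real Facebook API.

```python
import json
from typing import Optional

# Hypothetical in-memory store: user_id -> list of (timestamp, post) pairs.
_FEEDS = {
    42: [(100, "post A"), (200, "post B"), (300, "post C")],
}


def getfeed(api_key: str, user_id: int,
            since: Optional[int] = None, count: int = 20) -> str:
    """Return up to `count` newest feed items for `user_id` as a JSON string.

    `since` is treated as an exclusive timestamp lower bound (an assumption).
    """
    if not api_key:
        raise PermissionError("missing API key")

    items = _FEEDS.get(user_id, [])
    if since is not None:
        items = [item for item in items if item[0] > since]

    # Newest items first, truncated to the requested count.
    newest_first = sorted(items, key=lambda item: item[0], reverse=True)[:count]
    return json.dumps({
        "user_id": user_id,
        "feeds": [{"timestamp": ts, "post": post} for ts, post in newest_first],
    })


print(getfeed("demo-key", 42, since=100, count=2))
```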

At a high level, the Facebook News Feed service comprises two parts: Feed Generation and Feed Publishing.

Feed generation: Feed generation means building the news feed from the posts of the users/pages/groups that a user follows. Whenever any of them shares a relevant post, the feed generation system performs these steps:

  1. Retrieve the IDs of all users/pages/groups that the user follows, and retrieve their latest and relevant posts.
  2. Rank the posts and present the top K (say 20 or 50 posts) in the user’s feed.
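
The two steps above can be sketched as follows. This is an illustrative sketch, not Facebook's implementation: the `Post` fields, the in-memory `FOLLOWS` and `POSTS_BY_SOURCE` tables, and the placeholder score (likes only) are assumptions chosen to keep the example short.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Post:
    post_id: int
    source_id: int     # user/page/group that created the post
    created_at: int    # Unix timestamp
    likes: int


# Hypothetical data: who a user follows, and the posts of each source.
FOLLOWS = {1: [2, 3]}
POSTS_BY_SOURCE = {
    2: [Post(10, 2, 100, likes=5), Post(11, 2, 200, likes=1)],
    3: [Post(12, 3, 150, likes=9)],
}


def placeholder_score(post: Post) -> float:
    # Stand-in for the real ranking model: more likes ranks higher.
    return float(post.likes)


def generate_feed(user_id: int, k: int = 20, per_source_limit: int = 100) -> list:
    # Step 1: collect the latest posts from every source the user follows.
    candidates = []
    for source_id in FOLLOWS.get(user_id, []):
        posts = POSTS_BY_SOURCE.get(source_id, [])
        latest = sorted(posts, key=lambda p: p.created_at, reverse=True)
        candidates.extend(latest[:per_source_limit])

    # Step 2: rank the candidates and keep only the top K.
    return heapq.nlargest(k, candidates, key=placeholder_score)


print([post.post_id for post in generate_feed(1, k=2)])
```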

Feed publishing: Once the user has loaded all the current posts, the client has to request and pull further items from the server. When a new item appears, the server can either notify the user and push the item to them, or the client can pull it from the server.

Low-Level Design

Let’s look at the design of each component in detail.

Feed Generation

The Feed Generation Service builds the news feed by fetching the most recent posts from all the people that the user follows. However, this approach has several issues. It can be very slow for users who follow many friends: every time such a user reloads the feed, the service has to fetch and process a large number of posts. Similarly, for users with many followers, feed generation and publishing create a heavy load. Hence, we can pre-generate the timelines of such users in advance and store them in memory.

There can be servers that are devoted to continuously generating users’ newsfeeds and storing them in memory. So, whenever any request for the feed comes, the system could fetch the pre-generated data and hence increase the service’s responsiveness and reduce the latency.

However, many users log in only rarely. Instead of keeping a pre-generated feed in memory for every user, we can use an LRU-based cache. Such a cache evicts the feeds of the users who have gone the longest without accessing them.
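
A minimal sketch of such an LRU feed cache, built on Python's `collections.OrderedDict`, is shown below. The class name `FeedCache` and its `get`/`put` methods are hypothetical names chosen for this example.

```python
from collections import OrderedDict


class FeedCache:
    """Keeps pre-generated feeds for at most `capacity` users (LRU eviction)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._feeds = OrderedDict()   # user_id -> feed, least recently used first

    def get(self, user_id):
        if user_id not in self._feeds:
            return None               # cache miss: the feed must be regenerated
        self._feeds.move_to_end(user_id)   # mark as most recently used
        return self._feeds[user_id]

    def put(self, user_id, feed):
        self._feeds[user_id] = feed
        self._feeds.move_to_end(user_id)
        if len(self._feeds) > self.capacity:
            self._feeds.popitem(last=False)   # evict the least recently used user


cache = FeedCache(capacity=2)
cache.put(1, ["post A"])
cache.put(2, ["post B"])
cache.get(1)                 # user 1 is now the most recently used
cache.put(3, ["post C"])     # evicts user 2, who was used least recently
print(cache.get(2))          # None
```

On a cache miss, the service would fall back to generating the feed on demand and then store the result with `put`.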

Feed publishing

Feed publishing deals with the process of delivering a post to all the followers. There are two ways of publishing a post to the user: the Pull model and the Push model. Let’s discuss these two feed publishing methods in greater detail.

Pull Method: In this method, the client pulls the feed data from the server whenever it needs it. All the recent data is kept in memory on the server, available to the user on request. There are some shortcomings to this approach:

  1. The user won’t see recent data until the client issues a pull request to the server.
  2. The client may keep sending pull requests that return nothing when there is no new data available, which wastes resources.

Push Method: In this method, whenever a user publishes a post, it is immediately delivered to all the followers. This reduces the number of read operations, as followers no longer need to fetch data from the server every time. However, if a particular user has many followers, sending updates to all of them creates a heavy load on the server.
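
The difference between the two models can be sketched with two small in-memory classes. This is a simplified illustration with hypothetical names (`PullFeed`, `PushFeed`); it ignores ranking, persistence, and network delivery, and only shows where the work happens: at read time for pull, at write time for push.

```python
from collections import defaultdict


class PullFeed:
    """Pull model: the feed is assembled when the reader asks for it."""

    def __init__(self, following):
        self.following = following                 # user -> authors they follow
        self.posts_by_author = defaultdict(list)

    def publish(self, author, post):
        self.posts_by_author[author].append(post)  # one cheap write

    def read(self, user):
        feed = []
        for author in self.following.get(user, []):   # work grows with followees
            feed.extend(self.posts_by_author.get(author, []))
        return feed


class PushFeed:
    """Push model: each post is copied to every follower's inbox on publish."""

    def __init__(self, following):
        self.followers = defaultdict(list)          # author -> their followers
        for user, authors in following.items():
            for author in authors:
                self.followers[author].append(user)
        self.inbox = defaultdict(list)

    def publish(self, author, post):
        for follower in self.followers.get(author, []):   # work grows with followers
            self.inbox[follower].append(post)

    def read(self, user):
        return list(self.inbox.get(user, []))       # one cheap read


following = {"bob": ["alice"], "carol": ["alice"]}
for feed in (PullFeed(following), PushFeed(following)):
    feed.publish("alice", "hello")
    print(type(feed).__name__, feed.read("bob"))
```

Both classes return the same feed; a common compromise is a hybrid that pushes for ordinary users and falls back to pull for accounts with very large follower counts.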

Feed Ranking

The feed ranking algorithm plays an important role in the News Feed service. It considers a large number of factors to rank the posts for each user. Previously, Facebook used the EdgeRank algorithm to rank posts, but it now relies on Machine Learning based algorithms, since these can take a wide range of factors into account to produce a more personalized feed ranking.

The ranking algorithm assigns different weights to different signals such as likes, comments, shares, time of update, etc. It calculates a score from these features and also considers the relationship between the user and the post creator. The ranking system relies heavily on historical data and keeps evolving to increase user retention with the top K posts.
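
As a rough illustration of this kind of weighted scoring, the sketch below multiplies a weighted engagement sum by a time-decay factor and by an affinity value for the relationship between the viewer and the creator. The weights, the decay formula, and the function name `score_post` are made-up assumptions for this example; they are not Facebook's actual ranking formula.

```python
# Illustrative weights only: a share counts more than a comment or a like.
WEIGHTS = {"likes": 1.0, "comments": 2.0, "shares": 3.0}


def score_post(events: dict, age_hours: float, affinity: float) -> float:
    """Score one post for one viewer.

    events    : counts per event type, e.g. {"likes": 10, "comments": 2}
    age_hours : time since the post was created or last updated
    affinity  : 0..1 strength of the viewer's relationship with the creator
    """
    engagement = sum(WEIGHTS[name] * events.get(name, 0) for name in WEIGHTS)
    time_decay = 1.0 / (1.0 + age_hours)      # older posts score lower
    return affinity * engagement * time_decay


candidates = {
    "old_popular": score_post({"likes": 100, "shares": 10}, age_hours=99, affinity=0.5),
    "fresh_friend": score_post({"likes": 10, "comments": 2}, age_hours=1, affinity=0.5),
}
top_k = sorted(candidates, key=candidates.get, reverse=True)[:1]
print(top_k)
```

A learned model would replace the hand-picked weights with parameters fitted to historical engagement data.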

Optimization

Facebook News Feed is a read-heavy service: a large number of new posts are shown to users, which increases the read load considerably. Hence, the data needs to be distributed across multiple servers to ensure real-time service with minimum latency, and the post and feed databases need to be sharded so that reads and writes happen efficiently. We can conveniently partition the feed data by UserId and store all the data of a particular user on a single server, using a hashing method to map each request to its server. Sharding the database in this way increases the efficiency of the service, which directly improves user retention time, i.e., more profit for the company :)
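
A minimal sketch of mapping a UserId to a shard with a hash function is shown below. The function name `shard_for_user` and the choice of MD5 with simple modulo placement are assumptions for illustration; the text above only says that some hashing method is used.

```python
import hashlib


def shard_for_user(user_id: int, num_shards: int) -> int:
    """Map a user ID to a shard index in the range [0, num_shards)."""
    # A stable hash is used instead of Python's built-in hash(), so that
    # every server and every process maps the same user to the same shard.
    digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


NUM_SHARDS = 8
for user_id in (1, 2, 3, 42):
    print(f"user {user_id} -> shard {shard_for_user(user_id, NUM_SHARDS)}")
```

With plain modulo placement, changing the number of shards remaps most users to a different server; consistent hashing is a common alternative that limits how much data has to move.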

Conclusion

Social networking platforms are growing and developing day by day. With the advent of technologies such as Machine Learning and Artificial Intelligence, social media is becoming more and more customized: these technologies capture individual behavior and use it to show relevant ads, which has proved very helpful in enhancing the profitability of companies. Facebook News Feed is one such service, and it keeps becoming more user-friendly.

In this blog, we tried to show how the Facebook News Feed service was designed. This approach can be easily extended to design any customized News Feed service for any other system. Hope you liked it. Please share your views in the comments.
