Collaborative Filtering Simplified: The Basic Science Behind Recommendation Systems | by Sajan Gutta | Jan, 2021


Photo by Kari Shea on Unsplash

When making consumer decisions, it often seems that we are making conscious choices about the services we use and our preferred products. However, the companies competing for our business are constantly influencing our decisions in subtle ways. Companies often recommend specific products to increase the likelihood that we choose them over competitors, and the mix of product options we are exposed to has become increasingly tailored to our personal preferences. This is based on the theory that someone is more likely to purchase and enjoy a product matching their preferences. Recommendation systems allow companies to increase user engagement, increase sales, and continuously adapt offerings to users’ preferences [1]. In practice, the recommended list of songs or products we see may seem simple. However, a sophisticated, intuitive engineering process takes place behind the scenes to generate this list.

Recommendation systems leverage data generated from user behavior to make inferences about user preferences. At Spotify, the user data could be song choices, while Amazon often bases recommendations on a user’s viewed products, purchased products, and product reviews. The specifics of how a recommendation system is implemented are determined by the use case, but there are established general techniques for generating recommendations. The most notable and powerful technique is collaborative filtering, which we will now explore further.

Collaborative filtering uses a large set of data about user interactions to generate a set of recommendations. The idea behind collaborative filtering is that users with similar evaluations of certain items will enjoy the same things both now and in the future [2]. For example, assume User A and User B both enjoyed items X and Y. Based on this information, we can hypothesize that User A and B have similar preferences. Thus, if User B enjoyed item Z, we can recommend item Z to User A. This process of sourcing recommendations from similarities between users is why this technique is called “collaborative” filtering.
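The intuition above can be sketched with a simple user-similarity calculation. The ratings below are hypothetical, and cosine similarity is just one common choice of similarity measure:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two rating vectors (1.0 = identical direction)."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Hypothetical ratings for items X and Y on a 1-5 scale.
user_a = np.array([5, 4])
user_b = np.array([5, 5])
user_c = np.array([1, 2])

# Users A and B rate X and Y similarly, so their similarity is higher
# than that between A and C. If B also enjoyed some item Z, a collaborative
# filter would use this similarity to recommend Z to A.
print(cosine_similarity(user_a, user_b))
print(cosine_similarity(user_a, user_c))
```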

The interaction data powering a collaborative filtering system can be obtained from users via an explicit prompt or implicitly drawn from user behaviors. In the case of explicit data, a user provides clear and direct data regarding how much they enjoyed an item. This will usually be a “like” or a rating on a scale (see picture below) that the user was explicitly prompted to provide [5]. User preference data can also be gathered implicitly. This entails inferring whether a user prefers an item indirectly based on tracked behaviors. These behaviors can include which pages a user views, where they click, the time they spend looking at something, and more [3]. For example, if a user views an item for a long period of time, it often means that they like the item or they are interested in it.

An example of Amazon asking customers for reviews

Once interaction data has been collected, the immediate question becomes: how can we turn this data into recommendations? One of the most popular methods for doing this is a powerful linear algebra technique called matrix factorization. Matrix factorization provides a concrete mathematical basis for applying collaborative filtering, as it allows us to transform interaction data into a model that determines whether a user will like an item.

To understand matrix factorization, we must first understand how matrix multiplication works. A matrix is a set of numbers arranged in rows and columns, forming an array. When two matrices are multiplied together, the resulting matrix has the same number of rows as the first matrix and the same number of columns as the second. For each element in the resulting matrix, we take the corresponding row in the first matrix and the corresponding column in the second. We then multiply the corresponding elements of the selected row and column and sum the results. The figure below shows an example of how this process works. The value 75 in the result matrix is in the first row, so we select the first row of the first matrix. The value 75 is also in the first column, so we select the first column of the second matrix. Corresponding values in the row and column are then multiplied and the results are added together, resulting in the value 75.

An example of matrix multiplication (image by author)
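This row-times-column process can be verified in a few lines of NumPy. The matrices below are illustrative (not necessarily the ones in the figure), with values chosen so that the top-left entry of the product happens to be 75:

```python
import numpy as np

# Illustrative matrices; values chosen so the top-left product entry is 75.
A = np.array([[5, 5],
              [2, 3]])
B = np.array([[10, 4],
              [5, 6]])

C = A @ B  # matrix multiplication

# The top-left entry is the dot product of A's first row and B's first column.
top_left = A[0, 0] * B[0, 0] + A[0, 1] * B[1, 0]
print(C)
print(top_left)  # 5*10 + 5*5 = 75
```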

A useful property of multiplication is that it can be reversed through the process of factoring. For example, multiplying 3 and 4 results in the number 12. The number 12 can then be decomposed into the factors 3 and 4, since they multiply together to give 12. The same can be done with matrices. For most matrices, there exist two factor matrices whose product equals the original matrix, or comes extremely close to it. In the figure above, the third matrix can be factored into the first two matrices, because multiplying those two matrices together results in the third matrix. This process of finding two factors for a given matrix is known as matrix factorization.

In a recommendation system, user interactions are stored in a large matrix. Consider the case of a movie recommendation system. On one axis of the matrix we have the different users, while on the other axis are the different movies. Each individual value in the matrix corresponds to how much a user enjoyed that movie (measured either implicitly or explicitly, depending on how the data was collected). We can then take this matrix and apply matrix factorization. The resulting factors will be a matrix representing user preferences and a matrix representing the movies. Let’s walk through an example to build intuition for how this works.

This image is for the following Collaborative Filtering example. Blue numbers are data that was not collected. They represent predictions made by performing matrix factorization (Image by author).

Consider the situation in the image above with three users and three movies. The matrix outlined in black represents interaction data. Numbers in green and red were collected from observing user actions, while blue numbers were generated by performing matrix factorization. If a value is 1, the user liked the corresponding movie and if the value is 0, they did not.

The interaction matrix has been factored into embedding matrices for both users and items. The numbers in the two new matrices are chosen so that multiplying the two matrices together results in the interaction matrix. For example, to get the top-left value of the interaction matrix, we sum the products of corresponding elements of the first row in the user matrix and the first column in the item embedding matrix. This value is 1*1 + 0*0 = 1, which matches the top-left interaction value. Notice that the previously unknown blue values can now be derived by applying the same process. For example, the value for how much User 2 will enjoy Batman can be calculated using the second row of the user matrix and the second column of the item matrix.
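One pair of embedding matrices consistent with this example (assuming the movies are ordered Superman, Batman, Frozen and two latent features; the exact values in the figure may differ) can be written out and multiplied in NumPy:

```python
import numpy as np

# User embedding matrix: one row per user, one column per latent feature.
users = np.array([[1, 0],   # User 1
                  [1, 0],   # User 2
                  [0, 1]])  # User 3

# Item embedding matrix: one row per latent feature, one column per movie
# (Superman, Batman, Frozen).
items = np.array([[1, 1, 0],
                  [0, 0, 1]])

# Multiplying the embeddings reproduces the full interaction matrix,
# including the previously unknown entries.
interactions = users @ items
print(interactions)

# User 2's predicted value for Batman: second row of users, second column of items.
print(users[1] @ items[:, 1])  # 1*1 + 0*0 = 1 -> User 2 will likely enjoy Batman
```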

Examining the interaction values will show that our application of collaborative filtering has worked! Notice that users 1 and 2 have similar preferences, since both liked Superman and disliked Frozen. Thus, since user 1 likes Batman, user 2 will probably like Batman as well, as confirmed by the blue generated value of 1 in that spot. Conversely, notice that user 3 has very different preferences from users 1 and 2. Thus, even though both of those users liked Batman, user 3 probably will not, as confirmed by our blue generated value of 0 in that spot. This simple example demonstrates how matrix factorization determines whether a user will like an item. This information can be turned into recommendations by recommending the items a user is most likely to enjoy. In practice, data sets will be larger and the calculations will be more complex. While the values for this example were hand-engineered, the calculations for a production-level recommendation system will be handled automatically using a library with built-in machine learning functions [4].
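As a rough sketch of how a library might find such factors automatically, the snippet below fits two small embedding matrices to the observed entries with plain gradient descent. This is a simplified stand-in for the built-in machine-learning routines mentioned above; the data and hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Observed interactions (movies ordered Superman, Batman, Frozen);
# np.nan marks entries we never observed.
R = np.array([[1.0, 1.0,    0.0],
              [1.0, np.nan, 0.0],
              [0.0, np.nan, 1.0]])
mask = ~np.isnan(R)

k = 2  # number of latent features
U = rng.normal(scale=0.1, size=(3, k))  # user embeddings
V = rng.normal(scale=0.1, size=(k, 3))  # item embeddings

lr, reg = 0.05, 0.001
for _ in range(5000):
    # Error only on observed entries; unobserved cells contribute nothing.
    err = np.where(mask, (U @ V) - R, 0.0)
    U -= lr * (err @ V.T + reg * U)
    V -= lr * (U.T @ err + reg * V)

# After fitting, the product reconstructs the observed entries, and the
# previously unobserved cells now hold the model's predictions.
print(np.round(U @ V, 1))
```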

One thing we haven’t addressed is how these recommendations were made without defining any information about the movies. In our example, we know that Batman and Superman are both action movies while Frozen is a more family-oriented movie. However, we never defined the genres of the movies, yet examining user behaviors still produced predictions that seemed to account for genre. The ability to recommend items without having defined information about them is made possible by latent features. Latent features are features of items or user preferences that we haven’t explicitly defined. In finding a set of embedding matrix values that can explain our interaction data, the calculations used in matrix factorization implicitly discover a set of relevant features, because those latent features influenced the user interactions in the first place. These features can be anything from genre information to price. There is no way of knowing explicitly which features the factorization will capture, but we can typically hypothesize them based on the type of item we are recommending [5]. Latent features provide the true power of collaborative filtering, as they bring order to a large set of data. This ultimately results in recommendations for the user that are likely to be accurate.

One final thing to address about collaborative filtering is its biggest downside: the cold start problem. You may have noticed in our previous example that we were able to determine how much Users 2 and 3 would enjoy Batman because we knew how much User 1 enjoyed it. However, if we didn’t know what User 1 thought about the movie, how would we have made predictions for the other two users? Of course, we could note that Batman is an action movie similar to Superman, suggesting that User 2 will probably enjoy it and User 3 will not. However, relying on hand-defined information like this defeats the purpose of using latent features in collaborative filtering. This problem of struggling to make predictions about items without interaction data is referred to as the cold start problem.

When a collaborative filtering system is first created, it is often ineffective due to a lack of information about user preferences. This hinders the performance of this type of recommendation system and can make it ineffective when the user base is small or the item catalog is large. Usually, this is addressed by storing some relevant explicit feature information about each item (e.g. genre, rating, etc.) and matching it against a user’s past preferences. Because of the cold start problem, this sort of hybrid approach is often what ends up being used in the real world.

Now that we’ve reviewed the basics, you can begin to explore some tools and create your own recommendation system. Many programming languages can be used for creating a recommendation system, but the most common is Python. Python has great tools such as pandas and NumPy that will allow you to transform your interaction data into a form that is ready for calculations. Tools such as TensorFlow and PyTorch have built-in functions to handle the powerful calculations necessary for collaborative filtering.
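As a small illustration of that data-preparation step, pandas can pivot a long-format interaction log into the user-by-item matrix that matrix factorization expects. The data below is made up for the example:

```python
import pandas as pd

# Hypothetical long-format interaction log, one row per (user, movie) event.
ratings = pd.DataFrame({
    "user":  ["u1", "u1", "u2", "u3"],
    "movie": ["Superman", "Frozen", "Superman", "Frozen"],
    "liked": [1, 0, 1, 1],
})

# Pivot into a user x movie interaction matrix; entries with no
# interaction data become NaN, ready for a factorization routine.
matrix = ratings.pivot_table(index="user", columns="movie", values="liked")
print(matrix)
```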

Recommendation systems have become extremely common in modern businesses. They allow companies to practice advanced micro marketing strategies by tailoring their offerings to a user’s preferences. The power of recommendation systems will only continue to grow, as they learn more from our actions about our personal preferences. With these tools and the concepts that we’ve discussed here, you’ll be off to a great start towards creating your own recommendation system and harnessing the power of technology in your own venture.

NOTE: This article was focused on covering the basics of collaborative filtering and providing a simplified example of how it works. You can expect some more advanced articles in the future, walking through the technical steps for building a functional recommendation system!

[1] C. Underwood, “Use Cases of Recommendation Systems in Business — Current Applications and Methods,” Emerj, 04-Mar-2020. [Online]. Available: [Accessed: 10-Oct-2020].

[2] V. Kurama, “A Simple Introduction to Collaborative Filtering,” Built In, 04-Sep-2019. [Online]. Available: [Accessed: 10-Oct-2020].

[3] S. Luo, “Intro to Recommender System: Collaborative Filtering,” Medium, 06-Feb-2019. [Online]. Available: [Accessed: 10-Oct-2020].

[4] “Collaborative Filtering | Recommendation Systems | Google Developers,” Google. [Online]. Available: [Accessed: 10-Oct-2020].

[5] P. Pandey, “Recommendation Systems in the Real world,” Medium, 25-May-2019. [Online]. Available: [Accessed: 10-Oct-2020].
