Author: Parul Pandey

Takeaway

This article describes how to build a recommendation system pipeline and explains the overall architecture of such systems.

Too little choice is bad, but too much choice is also bad

Have you heard of the famous jam experiment? In 2000, psychologists Sheena Iyengar and Mark Lepper, of Columbia and Stanford universities, published a study based on a field experiment. On one day, shoppers at an upscale grocery store encountered a tasting booth featuring 24 varieties of jam. On another day, the same booth displayed only six varieties. The experiment was designed to determine which booth would sell more; the assumption was that a wider range of jams would attract more customers and thus more business. A strange phenomenon was observed instead: although the booth with 24 jams attracted more interest, its conversion rate was about ten times lower than that of the booth with only six.

So what happened? While abundant choice may seem appealing, choice overload can confuse and paralyze customers. Even though millions of items are available in online stores, without good recommendation systems these choices may do more harm than good.


Now let’s take a closer look at the architecture of recommendation systems and the various terms associated with them.

Terminology & Architecture

Let’s look at some important terms related to recommendation systems.

Items/documents

These are the entities a system recommends, such as movies on Netflix, videos on YouTube, and songs on Spotify.

Query/Context

The system uses certain information to recommend the items above, and this information constitutes a query. A query can be a combination of the following:

  • User information, which may include user IDs or items the user has previously interacted with.
  • Some additional context, such as the user’s device, the user’s location, etc.

Embeddings

An embedding is a way to represent categorical features as continuous-valued features. In other words, an embedding is a mapping from a high-dimensional vector to a lower-dimensional space called the embedding space. Here, the query and the items to be recommended must both be mapped into the embedding space. Many recommendation systems rely on learning appropriate embedding representations of queries and items.
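As a minimal sketch of the idea, an embedding can be pictured as a lookup table that maps each item ID to a low-dimensional vector. The table below is filled with random numbers purely for illustration; in a real system these vectors would be learned during training, and the sizes (1,000 items, 8 dimensions) are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical embedding table: 1,000 items mapped into an 8-dimensional
# embedding space. In a real system these vectors are learned, not random.
num_items, embedding_dim = 1000, 8
item_embeddings = rng.normal(size=(num_items, embedding_dim))

# Looking up an item's embedding is just an index into the table.
movie_id = 42
vector = item_embeddings[movie_id]
print(vector.shape)  # (8,)
```

Queries get their own embedding table of the same dimensionality, so that queries and items can be compared in the same space.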


The overall architecture

The general architecture of a recommender system consists of the following three main parts:

1. Candidate generation

This is the first stage of a recommendation system. It takes events from the user’s past activity as input and retrieves a small subset (hundreds) of candidates from a large corpus. There are two common approaches to candidate generation:

  • Content-based filtering

Content-based filtering involves recommending items based on the attributes of the item itself. The system recommends content that is similar to what the user has liked in the past.

  • Collaborative filtering

Collaborative filtering relies on user-item interactions, based on the notion that similar users like similar things ("customers who bought this item also bought that one").

2. Scoring

This is the second stage, in which another model further ranks and scores the candidates, typically narrowing them down to on the order of ten. In the case of YouTube, for example, the ranking network accomplishes this by assigning a score to each video according to a desired objective function, using a rich set of features describing the video and the user. The highest-scoring videos are then presented to the user, ordered by score.

3. Re-ranking

In the third stage, the system takes additional constraints into account to ensure diversity, freshness, and fairness. For example, the system removes content the user has explicitly disliked and gives consideration to newer content on the site.

The overall structure of a typical recommendation system


Similarity measure

How can you tell whether one item is similar to another? It turns out that both content-based filtering and collaborative filtering use some kind of similarity measure. Let’s look at two such metrics.

Consider two movies, movie1 and movie2, which belong to two different genres. Let’s plot the movies on a 2D graph, assigning 0 if a movie does not belong to a genre and 1 if it does.

Here, movie1 (1, 1) belongs to both genre 1 and genre 2, while movie2 (1, 0) belongs to only one genre. These positions can be thought of as vectors, and the angle between the vectors reflects the similarity between them.

Cosine similarity

Cosine similarity is the cosine of the angle between the two vectors: similarity(movie1, movie2) = cos(movie1, movie2) = cos 45° ≈ 0.7. A cosine similarity of 1 indicates maximal similarity, while a cosine similarity of 0 indicates no similarity.

Inner product

The dot product of two vectors is the product of their norms times the cosine of the angle between them: similarity(movie1, movie2) = ||movie1|| ||movie2|| cos(movie1, movie2).
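Both measures can be computed in a few lines with NumPy, using the genre-indicator vectors from the example above:

```python
import numpy as np

# Genre-indicator vectors: movie1 belongs to both genres, movie2 to one.
movie1 = np.array([1.0, 1.0])
movie2 = np.array([1.0, 0.0])

# Dot product: equals ||movie1|| * ||movie2|| * cos(angle).
dot = movie1 @ movie2

# Cosine similarity: divide the dot product by the norms, leaving cos(angle).
cosine = dot / (np.linalg.norm(movie1) * np.linalg.norm(movie2))

print(round(float(cosine), 3))  # 0.707, i.e. cos 45°
```

Note the difference: the dot product grows with the vectors' lengths, whereas cosine similarity depends only on the angle between them.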


Recommendation system pipeline

A typical recommender pipeline consists of the following five phases:

A typical recommendation system pipeline

Suppose we are building a movie recommendation system. The system has no prior knowledge of the users or the movies; it only observes users’ interactions with movies through the ratings they give. Here is a DataFrame consisting of the movie ID, the user ID, and the movie rating.

Movie ratings DataFrame

Since we only have ratings and no other information, we will use collaborative filtering for our recommendation system.

1. Preprocessing

  • Utility matrix transformation

We need to first convert the movie rating data into a user-item matrix, also known as a utility matrix.

Each cell of the matrix is filled with the user’s rating of that movie. The matrix is usually stored as a SciPy sparse matrix, since many cells are empty: most users rate only a few movies. Collaborative filtering does not work well when the data is too sparse, so we first need to compute the sparsity of the matrix.

If the sparsity value is about 0.5 or higher, collaborative filtering may not be the best solution. Another point to note is that the empty cells also correspond to new users and new movies. Therefore, if we have a high percentage of new users, we might consider other approaches, such as content-based or hybrid filtering.
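As a sketch of this step with pandas and SciPy, the code below builds the utility matrix and computes its sparsity for a tiny toy dataset (the column names `user_id`, `movie_id`, and `rating` are assumptions about the DataFrame shown earlier):

```python
import pandas as pd
from scipy.sparse import csr_matrix

# Toy ratings data in the same shape as the DataFrame above.
ratings = pd.DataFrame({
    "user_id":  [1, 1, 2, 3, 3],
    "movie_id": [10, 20, 10, 20, 30],
    "rating":   [4.0, 5.0, 3.0, 2.0, 5.0],
})

# Pivot into a user-item (utility) matrix; unrated cells become NaN.
utility = ratings.pivot(index="user_id", columns="movie_id", values="rating")

# Store it as a SciPy sparse matrix (0 here means "no rating").
sparse_utility = csr_matrix(utility.fillna(0).to_numpy())

# Sparsity = fraction of cells with no rating.
sparsity = 1.0 - ratings.shape[0] / (utility.shape[0] * utility.shape[1])
print(round(sparsity, 3))  # 5 ratings in a 3x3 matrix -> 0.444
```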

  • Normalization

There will always be some users who rate too positively (giving 4 or 5 to every movie) or too negatively (giving 1 or 2 to every movie). We can correct for this bias with mean normalization: subtracting each user’s average rating from their individual ratings.
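Mean normalization can be sketched in pandas as follows (again assuming the `user_id`/`rating` column names); note how a generous rater and a harsh rater end up on the same scale:

```python
import pandas as pd

ratings = pd.DataFrame({
    "user_id":  [1, 1, 2, 2],
    "movie_id": [10, 20, 10, 20],
    "rating":   [5.0, 4.0, 2.0, 1.0],  # user 1 is generous, user 2 is harsh
})

# Subtract each user's mean rating so their biases cancel out.
user_mean = ratings.groupby("user_id")["rating"].transform("mean")
ratings["rating_normalized"] = ratings["rating"] - user_mean

print(ratings["rating_normalized"].tolist())  # [0.5, -0.5, 0.5, -0.5]
```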

2. Model training

After data preprocessing, we need to start the model building process. Matrix factorization is a commonly used collaborative filtering technique, although there are other methods, such as neighborhood methods. Here are the steps involved:

  • Decompose the user-item matrix into two latent factor matrices: a user-factor matrix and an item-factor matrix.

A user’s rating of a movie is based on the movie’s features. Some of these features are directly observable, and we know they matter. However, there are other features that cannot be observed directly but are just as important for predicting ratings. These hidden features are called latent features.

Latent features can be thought of as underlying characteristics of the interaction between users and items. Essentially, we don’t know what each latent feature represents, but we can hypothesize that one might capture a user’s preference for comedies, another a preference for animated movies, and so on.

  • Missing ratings are predicted from the inner product of the two latent factor matrices.

The number of latent factors is denoted here by K. The reconstructed matrix fills in the empty cells of the original user-item matrix, so the previously unknown ratings are now predicted.

But how do we perform the matrix factorization described above? It turns out there are many ways, including:

  • Alternating least squares (ALS)
  • Stochastic gradient descent (SGD)
  • Singular value decomposition (SVD)

3. Hyperparameter optimization

Before tuning the hyperparameters, we need to pick an evaluation metric. A popular metric for evaluating recommendations is Precision at K, which looks at the top K recommendations and calculates the proportion of them that are actually relevant to the user.

Our goal, then, is to find the hyperparameters that yield the best Precision at K, or whatever other evaluation metric we want to optimize. Once these are found, we can retrain the model to obtain predicted ratings, and use those results to generate our recommendations.
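Precision at K itself is simple to compute. A minimal sketch, with a hypothetical list of recommended item IDs and a hypothetical relevant set:

```python
def precision_at_k(recommended, relevant, k=5):
    """Fraction of the top-k recommended items that are actually relevant."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# Hypothetical example: 3 of the top-5 recommendations are relevant.
recommended = ["m1", "m2", "m3", "m4", "m5", "m6"]
relevant = {"m1", "m3", "m5", "m9"}
print(precision_at_k(recommended, relevant, k=5))  # 0.6
```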

4. Post-processing

We can then rank all the predicted ratings and return the top N recommendations for each user. We also want to exclude, or filter out, items the user has previously interacted with. For movies, it makes no sense to recommend films the user has already seen or has disliked before.
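The filter-then-rank step can be sketched as follows, with a hypothetical vector of predicted ratings for one user:

```python
import numpy as np

# Hypothetical predicted ratings for one user over 6 movies (indices 0-5).
predicted = np.array([4.8, 2.1, 4.5, 3.9, 4.9, 1.5])
already_seen = {0, 4}  # movies the user has interacted with before

# Mask seen items with -inf so they can never rank, then take the top-N.
scores = predicted.copy()
scores[list(already_seen)] = -np.inf
top_n = np.argsort(scores)[::-1][:3]
print(top_n.tolist())  # [2, 3, 1]
```

Movies 0 and 4 have the highest raw scores, but both are filtered out before ranking.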

5. Evaluation

We touched on evaluation earlier; now let’s look at it in more detail. The best way to evaluate any recommendation system is to test it on a live system. Techniques like A/B testing are ideal because you get actual feedback from real users. However, if this is not possible, we have to resort to offline evaluation.

In traditional machine learning, we split the dataset into a training set and a validation set. This does not work directly for a recommendation model: if we train on one group of users and validate on a disjoint group, the model has no interactions to learn from for the held-out users. So for recommendation systems, what we actually do is randomly mask some known ratings in the matrix. We then predict the masked ratings with the model and compare the predictions to the actual ratings.
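The masking step can be sketched like this; the matrix, seed, and 30% hold-out fraction are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Utility matrix with 0 marking an unknown rating.
R = np.array([
    [5.0, 3.0, 0.0, 4.0],
    [4.0, 0.0, 1.0, 5.0],
    [0.0, 1.0, 5.0, 2.0],
])

# Randomly mask roughly 30% of the known ratings to form a held-out set.
known = np.argwhere(R > 0)
test_idx = known[rng.random(len(known)) < 0.3]

R_train = R.copy()
for u, i in test_idx:
    R_train[u, i] = 0.0  # hide the rating from the model during training

# A model trained on R_train predicts the masked cells; comparing those
# predictions with the held-out true ratings (e.g. via RMSE) is the score.
true_held_out = np.array([R[u, i] for u, i in test_idx])
print(len(test_idx), "ratings held out")
```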

Offline evaluation of a recommendation system

Earlier we discussed Precision at K as an evaluation metric. Here are a few others you can use.


Python libraries

There are many Python libraries created specifically for recommendation purposes. Here are some of the most popular:

  • Surprise[1]: a Python scikit for building and analyzing recommender systems.
  • Implicit[2]: fast Python collaborative filtering for implicit feedback datasets.
  • LightFM[3]: a Python implementation of a number of popular recommendation algorithms for implicit and explicit feedback.
  • pyspark.mllib.recommendation[4]: the recommendation module of Apache Spark’s machine learning API.

Conclusion

In this article, we discussed the importance of recommendation systems in narrowing down choices, and walked through their design and construction. Python simplifies the process considerably by providing a number of dedicated libraries. Try building your own personalized recommendation engine with one of them.

Resources

[1] Surprise: surpriselib.com/

[2] Implicit: implicit.readthedocs.io/en/latest/q…

[3] LightFM: lyst.github.io/LightFM/doc…

[4] pyspark.mllib.recommendation: spark.apache.org/docs/2.1.1/…


Original English article: towardsdatascience.com/recommendat…
