1. Introduction to the recommendation algorithm

Recommendation algorithms are so ubiquitous that we often believe we are doing what we want when we are in fact doing what someone else wants us to do; there is essentially no escaping them. This article roughly traces the evolution of recommendation algorithms. The aim is not a complete survey of the technology, but a focus on how the classical algorithms were developed and optimized, in the hope of drawing inspiration from these innovations and distilling some general ideas for algorithm design.

2. Traditional recommendation algorithm

The simplest recommendation method is to rank products by popularity and recommend the most popular ones. The underlying assumption is that a new customer will most likely like what the public likes. Even now, there are many situations where we subconsciously make choices this way. The obvious drawback, however, is that this method is not personalized: it cannot tailor recommendations to individual preferences.

Personalized recommendation uses a user's historical data to predict their preferences, which is indispensable in the era of big data. For the calculations that follow, we model users and products as a bipartite network, where an edge records that a user has purchased a product. What we want is to use this bipartite network to predict which item user $i$ is most likely to buy next (or to rank the candidates), which is essentially a link-prediction problem on a bipartite network.

2.1 Collaborative Filtering

Collaborative filtering is the most classical recommendation algorithm. Its idea is to consider both similarity and purchase relationships. There are two variants: user-based and item-based collaborative filtering.

  • User-based collaborative filtering assumes that users will buy products purchased by other users who are similar to them. User $i$'s score for item $\alpha$ is calculated as follows:

\tilde{f}^{i}_{\alpha} = \sum^{N}_{j=1} s_{ij}\, a_{j\alpha}

Here $N$ is the number of users, $s_{ij}$ is the similarity between user $i$ and user $j$, and $a_{j\alpha}$ indicates whether user $j$ has purchased item $\alpha$; it is either 0 or 1 and is an entry of the adjacency matrix. The formula aggregates the following information: it sums over the other users $j$ who are similar to user $i$, with more similar users contributing more weight; among them, only those who also bought item $\alpha$ contribute; finally, the summed similarities of these users form the evaluation score.

  • Item-based collaborative filtering assumes that users will buy products similar to those they have already purchased. User $i$'s score for item $\alpha$ is calculated as follows:

\tilde{f}^{i}_{\alpha} = \sum^{M}_{\beta=1} s_{\alpha\beta}\, a_{i\beta}

Here $M$ is the number of items, $s_{\alpha\beta}$ is the similarity between items $\alpha$ and $\beta$, and the other symbols have the same meanings as above. The formula aggregates the following information: it sums over the other items $\beta$ that are similar to item $\alpha$, with more similar items contributing more weight; among them, only the items $\beta$ that the target user $i$ has actually purchased contribute; finally, the summed similarities of these items form the evaluation score.

The core of collaborative filtering is the definition of similarity, for which there are many choices, such as the number of common neighbors, cosine similarity, and the Jaccard index.
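As a minimal sketch, the two scoring formulas above can be implemented with cosine similarity on a toy 0/1 purchase matrix (the matrix, the choice of cosine similarity, and the zeroed diagonal are our own illustrative assumptions):

```python
import numpy as np

# Toy user-item adjacency matrix A: A[i, a] = 1 if user i bought item a.
A = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 1],
], dtype=float)

def cosine_sim(X):
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0              # guard against empty rows
    Xn = X / norms
    return Xn @ Xn.T

def user_based_scores(A):
    """f~_ia = sum_j s_ij * a_ja  (user-user similarity weighted sum)."""
    S = cosine_sim(A)                    # s_ij over users (rows)
    np.fill_diagonal(S, 0.0)             # optionally exclude the user itself
    return S @ A

def item_based_scores(A):
    """f~_ia = sum_b s_ab * a_ib  (item-item similarity weighted sum)."""
    S = cosine_sim(A.T)                  # s_ab over items (columns)
    np.fill_diagonal(S, 0.0)
    return A @ S

scores = user_based_scores(A)
scores[A > 0] = -np.inf                  # mask already-purchased items
print(np.argmax(scores, axis=1))         # top recommendation per user
```

Masking the purchased items before ranking is the usual final step, since recommending something the user already owns is pointless.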

2.2 Matrix Factorization

Matrix factorization is a widely used mathematical operation, typically applied to information compression, feature extraction, and the like. It can also be used to make recommendations between users and products.

If the network contains N users and M items, the purchase matrix T has size N×M. Matrix factorization decomposes T into two matrices W and V of dimensions N×K and K×M respectively, which describe user preferences and product characteristics in a K-dimensional latent space (K≪N, K≪M).


T \approx WV

W and V can be obtained by minimizing the Frobenius distance between T and WV; related algorithms such as singular value decomposition (SVD) are well developed. After obtaining W and V, we compute $\tilde{T} = WV$. The larger an element of $\tilde{T}$, the more likely the corresponding edge is to appear in the future.


\|T - \tilde{T}\| = \sqrt{\sum_{i\alpha} (t_{i\alpha} - \tilde{t}_{i\alpha})^2}

$\tilde{T}$ can also be used to predict edge weights.
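The factorize-then-predict pipeline can be sketched with a truncated SVD (the toy matrix and the choice K = 2 are illustrative assumptions, not from the text):

```python
import numpy as np

# Toy purchase matrix T (N users x M items), entries 0/1.
T = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
], dtype=float)

K = 2  # latent dimension, K << min(N, M)

# Truncated SVD: T ≈ U_K Σ_K V_K^T; fold sqrt(Σ) into each factor.
U, s, Vt = np.linalg.svd(T, full_matrices=False)
W = U[:, :K] * np.sqrt(s[:K])          # N x K user factors
V = (Vt[:K, :].T * np.sqrt(s[:K])).T   # K x M item factors

T_hat = W @ V
# Larger entries of T_hat at zero positions of T suggest likely future links.
candidate = np.where(T == 0, T_hat, -np.inf)
print(np.argmax(candidate, axis=1))    # most likely next purchase per user
```

Dedicated recommenders usually minimize the error only over observed entries and add regularization; plain SVD over the full 0/1 matrix is the simplest version of the idea.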

3. Network-based recommendation algorithms

After the rise of network science, network-based recommendation algorithms developed steadily and have by now largely matured. In 2010, a paper in PNAS proposed two typical network recommendation frameworks [PNAS 107, 4511 (2010)], based on the principles of mass diffusion and heat conduction respectively. Subsequent algorithmic improvements have largely built on these two ideas.

3.1 Mass diffusion

The mass diffusion algorithm computes user $i$'s score for each item $\alpha$ by the following steps:

  1. Initialization: mark every item purchased by user $i$ with a score of 1
  2. Item $\to$ user: each item's score is split evenly among all users who purchased it
  3. User $\to$ item: each user's score is split evenly among the items that user purchased

This even-splitting process resembles physical mass diffusion: the scores are extensive quantities that add directly, and the total "mass" is conserved. Each item's final score expresses user $i$'s predicted preference for it, and the top-scoring items are recommended.
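The three steps above can be sketched in a few lines of numpy (the toy matrix and the function name are our own; the guard against zero degrees is a defensive assumption):

```python
import numpy as np

def mass_diffusion(A, i):
    """ProbS-style scores of all items for user i.
    A: user-item adjacency matrix (N x M), entries 0/1."""
    k_user = A.sum(axis=1)                     # user degrees
    k_item = A.sum(axis=0)                     # item degrees
    f = A[i].astype(float)                     # step 1: resource 1 on bought items
    # step 2: item -> user, each item splits its resource among its buyers
    u = A @ (f / np.maximum(k_item, 1))
    # step 3: user -> item, each user splits its resource among bought items
    return (u / np.maximum(k_user, 1)) @ A

A = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 1, 1]], dtype=float)
scores = mass_diffusion(A, 0)
print(scores)   # total resource stays equal to the number of items user 0 bought
```

Note that the total score after the two hops equals the initial total, reflecting the conservation of "mass" described above.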

3.2 Heat conduction

The heat conduction algorithm computes user $i$'s score for each item $\alpha$ as follows:

  1. Initialization: mark every item purchased by user $i$ with a score of 1 (for this algorithm the initial value makes no difference)
  2. Item $\to$ user: each item's score is passed in full to every user who purchased it, and each user then takes the average of the scores received from its items as its score for this stage
  3. User $\to$ item: each user's score is passed to the items that user purchased, and each item likewise takes the average of the received scores as its final score

This averaging process resembles heat conduction and is strongly homogenizing. Each item's final score expresses user $i$'s predicted preference for it, and the top-scoring items are recommended.
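A matching sketch of the averaging variant (same illustrative toy matrix; the only change from mass diffusion is averaging at the receiving end instead of splitting at the sending end):

```python
import numpy as np

def heat_conduction(A, i):
    """HeatS-style scores of all items for user i (averaging, not splitting)."""
    k_user = A.sum(axis=1)
    k_item = A.sum(axis=0)
    f = A[i].astype(float)                       # step 1: 1 on bought items
    # step 2: each user averages the scores of the items it bought
    u = (A @ f) / np.maximum(k_user, 1)
    # step 3: each item averages the scores of its buyers
    return (A.T @ u) / np.maximum(k_item, 1)

A = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 1, 1]], dtype=float)
print(heat_conduction(A, 0))
```

Unlike mass diffusion, the total score is not conserved here, and the low-degree item (index 3, bought by one user) fares relatively better, which is exactly the homogenizing, novelty-favoring behavior described in the next subsection.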

3.3 Differences between mass diffusion and heat conduction, and how to couple them

The two methods recommend very differently. In general, mass diffusion is more accurate, but because scores are summed, high-degree nodes, i.e. popular items, tend to receive higher scores. Conversely, heat conduction is often less accurate than mass diffusion, but because of the averaging operation, low-degree nodes are more likely to receive high scores, meaning unpopular items get recommended more.

In practice, the two methods are often combined to obtain the best recommendation effect. Linear coupling and nonlinear coupling are both commonly used.

Linear coupling is the common, general-purpose approach: take a linear combination of the results of the individual methods.
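A sketch of the standard form (the mixing parameter $\lambda$ is our notation, not fixed by the text):

```latex
\tilde{f}^{i}_{\alpha} = \lambda\, \tilde{f}^{i,\mathrm{MD}}_{\alpha}
                       + (1-\lambda)\, \tilde{f}^{i,\mathrm{HC}}_{\alpha},
\qquad \lambda \in [0,1]
```

with $\lambda = 1$ recovering pure mass diffusion and $\lambda = 0$ pure heat conduction.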

Nonlinear coupling usually has to be designed case by case for the specific problem; one common option is to introduce an exponential term whose exponent tunes the effect.
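For reference, the hybrid proposed in the cited PNAS paper couples the two at the level of the item-to-item transition matrix (a sketch; $k$ denotes degree and $\lambda$ the coupling exponent):

```latex
W_{\alpha\beta} = \frac{1}{k_{\alpha}^{1-\lambda}\, k_{\beta}^{\lambda}}
                  \sum_{j=1}^{N} \frac{a_{j\alpha}\, a_{j\beta}}{k_j}
```

so that $\lambda = 1$ reduces to mass diffusion and $\lambda = 0$ to heat conduction, with intermediate $\lambda$ trading accuracy against diversity.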

[Figure omitted: each panel shows how an evaluation metric varies with the coupling parameter of the nonlinear coupling method. Each column is a dataset (users rating movies, music, and bookmarks, respectively) and each row a metric: the first two rows measure recommendation accuracy, the last two measure diversity. The specific metrics are explained in the next chapter.]

4. Evaluation metrics for recommendation effect

Before presenting subsequent improvements, we need to introduce the commonly used metrics for evaluating recommendation effect. Recommendation effect is usually evaluated along two axes: accuracy and diversity. Of the five metrics below, the first two measure accuracy and the last three measure diversity.

4.1 Ranking Score


R = \frac{1}{|E^p|} \sum_{i\alpha \in E^p} R_{i\alpha}

Ranking score is a classic metric for recommendation effectiveness. In the formula above, $R_{i\alpha}$ is the relative rank of item $\alpha$ in user $i$'s recommendation list (e.g. 1/10 means it ranked first out of 10 items); $E^p$ is the probe set of held-out links, and its size $|E^p|$ serves as the normalization coefficient. The lower $R$ is, the better the recommendation effect.
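A minimal sketch of the computation (the probe entries below are hypothetical examples):

```python
def ranking_score(probe_ranks):
    """Average relative rank over the probe set E^p.
    probe_ranks: (rank, list_length) pairs, one per held-out edge."""
    return sum(r / n for r, n in probe_ranks) / len(probe_ranks)

# three held-out items ranked 1st, 3rd and 5th out of 10 candidates each
print(ranking_score([(1, 10), (3, 10), (5, 10)]))
```

A perfect recommender would rank every probe item first, driving the score toward its minimum.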

4.2 Precision


P_i(L) = \frac{d_i(L)}{L}

Precision is another common metric in practical recommendation; evaluating only the first $L$ recommended items matches real scenarios more closely. Here $d_i(L)$ denotes the number of relevant items, i.e. items from user $i$'s held-out probe set, that appear among the top $L$ recommendations.

However, $P_i(L)$ is a coarser measure than the ranking score, so the two metrics are usually used together.

4.3 Inter-similarity

Inter, as opposed to the intra-similarity below, measures the difference between the recommendations given to two users (the greater the difference, the better):


H_{ij}(L) = 1 - \frac{C_{ij}(L)}{L}

$C_{ij}(L)$ is the number of items common to the top-$L$ recommendation lists of users $i$ and $j$, and $L$ is again the length of the recommendation list.

4.4 Intra-similarity

Intra-similarity measures the diversity within a single user's recommendation list: the greater the internal difference, the better. (The formula below actually measures the average pairwise similarity $s^{o}_{\alpha\beta}$ between recommended items, so smaller values indicate the same desired effect.)


I_i(L) = \frac{1}{L(L-1)} \sum_{\alpha \neq \beta} s^{o}_{\alpha\beta}

4.5 Novelty

This metric measures the "novelty" of recommendations, i.e. whether popular items are avoided. The lower the value, the more novel the recommendations.


N_i(L) = \frac{1}{L} \sum_{\alpha \in O^i} K_{\alpha}

$K_\alpha$ is the degree of item $\alpha$, i.e. the number of users who bought it, and $O^i$ is user $i$'s recommendation list.
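The three diversity metrics above can be sketched in plain Python (the lists, similarity values, and degrees below are made-up toys):

```python
def inter_similarity(list_i, list_j, L):
    """H_ij(L) = 1 - C_ij(L)/L, with C_ij the items shared by the two top-L lists."""
    c = len(set(list_i[:L]) & set(list_j[:L]))
    return 1 - c / L

def intra_similarity(list_i, sim, L):
    """I_i(L): mean pairwise similarity inside one list (smaller = more diverse)."""
    items = list_i[:L]
    total = sum(sim[a][b] for a in items for b in items if a != b)
    return total / (L * (L - 1))

def novelty(list_i, degree, L):
    """N_i(L): mean degree of the recommended items (lower = more novel)."""
    return sum(degree[a] for a in list_i[:L]) / L

sim = {0: {1: 0.2}, 1: {0: 0.2}}       # toy item-item similarities
degree = {0: 50, 1: 3, 2: 7, 3: 4}     # toy item degrees
print(inter_similarity([0, 1, 2], [1, 2, 3], 3))   # two of three items shared
print(intra_similarity([0, 1], sim, 2))
print(novelty([0, 1], degree, 2))
```

In practice these per-user values are averaged over all users (or user pairs) to report a single number per recommender.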

5. Method improvement

Based on the above methods, various measures can be adopted to further optimize their effect.

The simplest improvement is to optimize the initial configuration. The original methods assign the same weight to all purchased items, but we can instead assign different weights to different items, for example giving more weight to low-degree items. The updated initial score can be written as


f^{i}_{j} = a_{ij}\, K^{\beta}(o_j)

$a_{ij}$ is the adjacency entry, $K(o_j)$ is the degree of item $o_j$, and $\beta$ is a tunable parameter: when $\beta>0$ the formula amplifies popular items, and when $\beta<0$ it amplifies unpopular (low-degree) items. In the literature [EPL 81, 58004 (2008)], $\beta \approx -0.8$ yields the optimal recommendation effect.
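A sketch of this reweighted initialization (the function name and toy degrees are ours; the default $\beta = -0.8$ follows the cited EPL result):

```python
def weighted_initial_resource(purchases, item_degree, beta=-0.8):
    """Initial score k**beta for each purchased item (0 otherwise).
    beta < 0 boosts low-degree (unpopular) items."""
    return [k ** beta if bought else 0.0
            for bought, k in zip(purchases, item_degree)]

# user bought items 0 and 1; item 0 is popular (degree 100), item 1 is niche
f0 = weighted_initial_resource([1, 1, 0], [100, 2, 5])
print(f0)   # the niche item starts with more resource than the popular one
```

The diffusion steps then proceed exactly as before, just starting from this biased configuration instead of all-ones.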

The basic idea of the similarity-preferential diffusion method is to enlarge the weight of similar users: during mass diffusion or heat conduction, the resource allocation is biased by a tunable parameter $\theta$ so that more similar users receive higher scores [EPL 105, 58002 (2014)].

In addition, there are other optimization approaches, such as personalized parameters [Physica A 392, 3417 (2013)] to tune the recommendation per user, or virtual links [EPL 100, 58005 (2012)], which add virtual edges to improve the score diffusion.

6. Discussion and summary

Recommendation algorithms were born with the rise of the Internet. As big data keeps growing and application scenarios become more complex, recommendation algorithm design must keep evolving to meet actual needs. On one hand, its scope of application can be extended, for example to weighted networks, multilayer networks, and time-varying networks. On the other hand, information beyond the network topology, such as user profiles or stated preferences, can be better exploited to improve recommendations and to better address the cold-start problem.