
Introduction

Popularity bias is a long-standing problem in recommendation systems: popular items are over-recommended, while less popular items that users may be interested in are under-recommended. This bias has adverse effects on both consumers and businesses, and many studies have been devoted to mitigating it.

Most existing studies examine popularity bias in a static setting: the recommendation model is trained on an offline dataset and a single round of recommendations is analyzed for bias. Although these studies highlight how prevalent popularity bias is, the paper argues that the bias is dynamic: the factors driving it, how it evolves over time, and how effective mitigation approaches are under realistic assumptions of system evolution all remain underexplored. Therefore, this paper proposes a framework for studying popularity bias in dynamic recommendation.

Dynamic recommendation can be viewed as a closed loop, as shown in Figure 1. Users interact with the system through a series of operations (such as clicking, viewing, and rating); the collected feedback data is used to train the recommendation model; the trained model then recommends new items to users; and the cycle starts again.

While many factors may affect this dynamic recommendation process, the authors identify four key factors that shape popularity bias and its evolution:

  • Inherent audience size imbalance: some items naturally appeal to more users than others (even under unbiased random recommendation), so a few items may have very large audiences while most items have small ones.
  • Model bias: the recommendation model itself may amplify any imbalance in the data it is trained on.
  • Position bias: among the items recommended by the model, higher-ranked items are more likely to be examined by users.
  • Closed feedback loop: because the cycle repeats, feedback data collected under the current model influences the training of future models, so biases can accumulate.

This paper introduces the popularity-opportunity bias, a formalization of popularity bias based on the notion of equal opportunity. Traditional measures of popularity bias are based on statistical averages that compare how often popular and unpopular items are recommended. The popularity-opportunity bias instead measures whether popular and unpopular items are clicked (or engaged with by some other metric) in proportion to their true audience sizes. By comparing engagement rather than raw recommendation counts, the popularity-opportunity bias can be linked directly to user satisfaction and to the economic benefit of item providers.

This paper has three main contributions:

  1. Firstly, a comprehensive empirical study of popularity bias in dynamic recommendation is conducted through simulation experiments, investigating how the bias evolves and how the four factors above influence it. The paper finds that inherent audience size imbalance and model bias are the main drivers of popularity bias, while position bias and the closed feedback loop further exacerbate it. In addition, two different negative sampling strategies are compared to show that the effect of popularity bias can be mitigated by careful design of negative sampling.

  2. Secondly, the paper discusses how to mitigate popularity bias in dynamic recommendation. It shows how to adapt debiasing methods proposed for the static setting to dynamic scenarios, and further proposes a model-agnostic False Positive Correction (FPC) debiasing method, which can be combined with other debiasing methods to further improve performance.

  3. Finally, extensive experiments show the effectiveness of the proposed dynamic debiasing methods compared with static strategies, and also illustrate the significant performance improvement brought by FPC.

Problem formalization

Dynamic recommendation formalization

Suppose an online platform provides recommendations, such as movies, jobs, or songs. The dynamic recommendation process is as follows:

  1. Every time a user visits the platform, the platform presents a ranked list of items produced by a personalized recommendation model learned from the user's historical feedback.
  2. The personalized model is updated periodically with newly collected user feedback. To bootstrap the system for new users (cold start), the platform uses a non-personalized method to learn user preferences, such as randomly presenting items to users and collecting feedback.

Formally, assume a set of users $U = \{1, 2, \ldots, N\}$ and a set of items $I = \{1, 2, \ldots, M\}$. Each user likes a subset of $I$. The number of users who like item $i$ is called the audience size of item $i$, denoted $A_i$.

In the bootstrap step, the system randomly recommends $K$ items to each arriving user to collect an initial set $D$ of user-item clicks. Based on $D$, the first recommendation model $\psi$ is trained; for example, a matrix factorization (MF) model can be used. As users arrive one by one, the system uses the latest model to recommend a ranked list of $K$ items and collects the resulting user-item clicks. After every $L$ user visits, the system retrains the model on all clicks collected so far. The whole process is shown in Algorithm 1.
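To make the loop concrete, the following Python sketch mirrors the process described above. It is a minimal illustration, not the paper's implementation: `train_mf`, `recommend_top_k`, and `simulate_clicks` are assumed callables standing in for model training, ranking, and user feedback.

```python
import random

def run_dynamic_recommendation(users, items, train_mf, recommend_top_k, simulate_clicks,
                               K=20, T=40000, L=50, bootstrap_users=100):
    """Minimal sketch of the dynamic recommendation loop (Algorithm 1), under assumptions.

    train_mf(clicks)                 -> fits a model on a list of (user, item) click pairs
    recommend_top_k(model, u, K)     -> returns a ranked list of K items for user u
    simulate_clicks(u, ranked_items) -> returns the items user u clicks in that list
    """
    # Bootstrap: random recommendations to collect the initial click log D
    clicks = []
    for u in random.sample(list(users), min(bootstrap_users, len(users))):
        ranked = random.sample(list(items), K)
        clicks += [(u, i) for i in simulate_clicks(u, ranked)]
    model = train_mf(clicks)  # first recommendation model

    # Main loop: users arrive one by one; the model is retrained every L visits
    for t in range(1, T + 1):
        u = random.choice(list(users))
        ranked = recommend_top_k(model, u, K)
        clicks += [(u, i) for i in simulate_clicks(u, ranked)]
        if t % L == 0:
            model = train_mf(clicks)
    return model, clicks
```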

Formalizing the popularity bias

The popularity-opportunity bias is introduced to formalize popularity bias. To quantify the bias at each iteration $t$ of the dynamic recommendation process, the true positive rate of each item is computed first. Suppose item $i$ has received $C^t_i$ clicks by iteration $t$; its true positive rate is $TPR_i = C^t_i / A_i$. The Gini coefficient is then used to measure how unequal the true positive rates are with respect to item popularity at iteration $t$:


$$Gini_t = \frac{\sum_{i \in I}(2i - M - 1)\,TPR_i}{M \sum_{i \in I} TPR_i}$$

Here items are indexed from 1 to $M$ in non-descending order of audience size, i.e., $A_i \leq A_{i+1}$. $Gini_t \in [-1, 1]$ quantifies the popularity bias: the smaller its absolute value, the lower the bias. $Gini_t > 0$ means the true positive rate is positively correlated with audience size, while $Gini_t < 0$ means it is negatively correlated (i.e., an inverse popularity bias).
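As a concrete illustration, the Gini coefficient above can be computed directly from per-item click counts and audience sizes. The snippet below is a minimal sketch (not the paper's code), assuming every audience size is positive:

```python
import numpy as np

def popularity_opportunity_gini(clicks_t, audience_sizes):
    """Gini coefficient of true positive rates, following the formula above.

    clicks_t       : per-item click counts C_i^t at iteration t
    audience_sizes : per-item audience sizes A_i (assumed > 0)
    """
    clicks_t = np.asarray(clicks_t, dtype=float)
    audience_sizes = np.asarray(audience_sizes, dtype=float)

    # True positive rate of each item: TPR_i = C_i^t / A_i
    tpr = clicks_t / audience_sizes

    # Items must be indexed 1..M in non-descending order of audience size
    tpr = tpr[np.argsort(audience_sizes)]

    M = len(tpr)
    idx = np.arange(1, M + 1)                      # i = 1, ..., M
    denom = M * np.sum(tpr)
    return float(np.sum((2 * idx - M - 1) * tpr) / denom) if denom > 0 else 0.0
```

A value near 0 means clicks are spread in proportion to audience size, while a value near 1 means clicks are concentrated on the most popular items.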

Factors influencing the popularity bias

As shown in Figure 1, four main factors are considered:

Inherent audience size imbalance

Different items have different audience sizes, and this imbalance can itself lead to popularity bias. For example, item audience sizes typically follow a long-tail distribution: a small number of items have very large audiences, while a large number of items have small ones. This produces an inherent imbalance in engagement data (clicks, etc.) even under completely random recommendations.

Model bias

When a user likes two items to the same degree, the recommendation model tends to rank the item with more clicks in the training data above the one with fewer clicks. This is a widespread defect of collaborative filtering algorithms: if the training data is imbalanced, the model directly produces popularity bias.

Position bias

In ranking scenarios, items placed at higher positions are commonly examined more often than items at lower positions. When inherent audience size imbalance and model bias are present, position bias gradually amplifies the popularity bias.

Closed feedback loop

Future models are trained on click data collected under previous models. As the feedback loop continues, popularity bias generated in the past accumulates, and subsequent models generate even more bias.

Empirical study

This section presents an empirical study of how popularity bias evolves in dynamic recommendation, including the influence of the four factors discussed above. In addition, two different negative sampling strategies are compared to show that the influence of popularity bias can be alleviated through careful design of negative sampling.

Setup

The experiment faced two key challenges:

  1. How to obtain complete ground-truth user-item relevance?
  2. How to simulate user clicks on the recommendations?

For the first challenge, semi-synthetic data is generated from real-world user-item interaction datasets, ML1M and Ciao, randomly sampling 1,000 users from each. A matrix factorization (MF) model is then used to complete the original datasets, providing ground-truth user-item relevance. By modifying the data generation process, four variants with different levels of inherent audience size imbalance are also generated for each base dataset to study the effect of this imbalance.

The simulation is then run following Algorithm 1, with $K=20$, $T=40000$, and $L=50$.

For the second challenge, user clicks are simulated with a position-bias model $\delta_k = 1/\log_2(1+k)$, which gives the probability that user $u$ examines (and can thus click) item $i$ at position $k$.
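A minimal sketch of this click model follows, under the assumption (consistent with the FPC formulation later in the paper) that a user clicks an item only if they both like it and examine its position; `likes_u` is an illustrative ground-truth set of liked items, not a name from the paper.

```python
import math
import random

def examination_prob(k):
    """Probability that a user examines ranking position k (1-indexed): 1 / log2(1 + k)."""
    return 1.0 / math.log2(1 + k)

def simulate_clicks(likes_u, ranked_items):
    """Simulate clicks for one user visit.

    likes_u      : set of items the user truly likes (ground truth)
    ranked_items : top-K recommendation list, best-ranked first
    """
    clicked = []
    for k, item in enumerate(ranked_items, start=1):
        # A click happens only if the item is liked AND its position is examined
        if item in likes_u and random.random() < examination_prob(k):
            clicked.append(item)
    return clicked
```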

In addition, a notable design choice is that negative samples are drawn from each user's recommended-but-unclicked items; compared with the conventional strategy of sampling negatives from all unclicked items, this achieves higher recommendation utility and lower popularity bias.
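The two negative sampling strategies being contrasted can be sketched as follows (function and argument names are illustrative, not from the paper):

```python
import random

def sample_negatives_all_unclicked(user_clicked, all_items, n):
    """Conventional strategy: sample negatives from ALL items the user has not clicked."""
    candidates = [i for i in all_items if i not in user_clicked]
    return random.sample(candidates, min(n, len(candidates)))

def sample_negatives_recommended_unclicked(user_recommended, user_clicked, n):
    """Alternative strategy: sample negatives only from items that were actually
    recommended to the user but not clicked (i.e., false positive signals)."""
    candidates = [i for i in user_recommended if i not in user_clicked]
    return random.sample(candidates, min(n, len(candidates)))
```

One intuition for the second strategy is that it avoids treating items the user was never even shown as negative evidence, which would otherwise penalize low-exposure items.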

Approaches to mitigating the bias

The empirical study shows that inherent audience size imbalance and model bias are the two most important factors behind popularity bias. Of the four factors, model bias is the one the system operator can directly control, while the other three are intrinsic to the system. The experiments also show that if model bias is eliminated, the popularity bias remains very low even under the influence of the other three factors. Therefore, this section focuses on reducing popularity bias in dynamic recommendation by reducing model bias.

Mitigate model bias dynamically

This section focuses on how to adapt existing static debiasing methods to the dynamic scenario; the core idea is to gradually increase the debiasing strength as the dynamic process proceeds.

Most existing work reduces popularity bias in static settings by reducing model bias. For example, one approach, called Scale, alleviates the bias by rescaling the output of the recommendation model. Specifically, for a user-item pair $(u, i)$, the rescaled score is:


$$\hat{r}^{scaled}_{u,i} = \hat{r}^{model}_{u,i} / (C_i)^{\alpha}$$

where $\hat{r}^{model}_{u,i}$ is the prediction score output by the recommendation model, $C_i$ is the number of clicks item $i$ has received in the training data, and $\alpha$ is a hyperparameter controlling the debiasing strength; the larger the value, the stronger the debiasing.

In static recommendation, $\alpha$ is a constant, which is not feasible in dynamic recommendation. This paper therefore proposes to increase $\alpha$ gradually from 0 as the dynamic recommendation process proceeds, with an increment step $\Delta$: starting from 0, $\alpha$ is increased by $\Delta$ at each iteration.
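A minimal sketch of this schedule, with illustrative values for the model scores, click counts, and step size $\Delta$ (none of these numbers come from the paper):

```python
import numpy as np

def scale_debias(model_scores, item_clicks, alpha):
    """Rescale model scores: r_scaled = r_model / (C_i)^alpha."""
    clicks = np.maximum(np.asarray(item_clicks, dtype=float), 1.0)  # guard against zero clicks
    return np.asarray(model_scores, dtype=float) / clicks ** alpha

# Dynamic schedule: alpha starts at 0 and grows by Delta after each retraining round.
raw_scores = np.array([0.9, 0.7, 0.6])       # illustrative model scores for three items
clicks_per_item = np.array([500, 50, 5])     # illustrative click counts C_i
alpha, delta = 0.0, 0.01                     # Delta is a tunable increment

for retraining_round in range(3):            # each round stands for one model update
    scores = scale_debias(raw_scores, clicks_per_item, alpha)
    print(retraining_round, scores)          # popular items are penalized more as alpha grows
    alpha += delta
```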

Use false positive correction

This paper further proposes a model-agnostic debiasing method, False Positive Correction (FPC), which corrects predicted scores in a probabilistic manner based on false positive signals. (Positive signals are clicks; false positives are items that were recommended but not clicked.)

Suppose we want to predict the relevance $\hat{r}_{u,i}$ between user $u$ and item $i$, and the recommendation model outputs a prediction score $\hat{r}^{model}_{u,i}$. Assume item $i$ has been recommended to user $u$ $F$ times, at positions $\{k_1, \ldots, k_F\}$, and the corresponding false positive signals are $\{c_{k_1}=0, \ldots, c_{k_F}=0\}$. Further define the probability that user $u$ likes item $i$ as $\theta_{u,i}$, i.e., $P(r_{u,i}=1) = \theta_{u,i}$, and the probability that the item at ranking position $k$ is examined as $\delta_k$, i.e., $P(e_k=1) = \delta_k$. With these definitions, the conditional probability that $u$ likes $i$ given the false positive signals can be computed:
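
The section breaks off before the formula itself. Under the definitions above, and assuming a click occurs only when the user likes the item and examines its position, a false positive at position $k_f$ has probability $1-\delta_{k_f}$ if $r_{u,i}=1$ and probability 1 if $r_{u,i}=0$. Applying Bayes' rule then gives the following sketch of the conditional probability, reconstructed from these assumptions rather than quoted from the paper:

$$P(r_{u,i}=1 \mid c_{k_1}=0, \ldots, c_{k_F}=0) = \frac{\theta_{u,i}\prod_{f=1}^{F}(1-\delta_{k_f})}{\theta_{u,i}\prod_{f=1}^{F}(1-\delta_{k_f}) + (1-\theta_{u,i})}$$

FPC would then use this corrected probability (with $\theta_{u,i}$ estimated from $\hat{r}^{model}_{u,i}$) as the debiased prediction score.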