Matrix Factorization is one of the most commonly used collaborative filtering algorithms for recommendation systems. This family of algorithms used to be the mainstay of recommendation systems and remains useful in many settings today. Here, we trace the line of thinking from the most basic Matrix Factorization (MF) algorithm to its derived Probabilistic Matrix Factorization (PMF). For the detailed solution procedures, refer to the two earlier blog posts on matrix-factorization-based recommendation and on probabilistic matrix factorization.

Matrix Factorization

In short, MF can be viewed as a solution to the recommendation problem under the guidance of the Latent Factor Model, which is itself a branch of the collaborative filtering family. We introduced collaborative filtering in a previous article.

In the discussion of collaborative filtering, we mentioned the user-item preference matrix. The main idea of the Latent Factor Model is that there exist some unobservable hidden variables representing user preferences: a user's tastes can be fully described by these hidden variables, and the same hidden variables in turn determine the user's preference for each item. Expressed in terms of the user-item preference matrix, this means we can decompose the matrix into the product of two smaller matrices.

Let's say we have $N$ users, $M$ items, and a user-item preference matrix $R \in \mathbb{R}^{N \times M}$. Let $U \in \mathbb{R}^{D \times N}$ denote the matrix of user latent factors and $V \in \mathbb{R}^{D \times M}$ the matrix of item latent factors, where $D$ is the number of latent factors. Under the latent factor model assumption, the matrix factorization algorithm can be expressed as $R \approx U^\top V$. Specifically, for each entry of $R$, there are vectors $U_i$ and $V_j$ such that $R_{ij} \approx U_i^\top V_j$, i.e., the preference of user $i$ for item $j$ is the inner product of their latent factor vectors.
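
As a minimal sketch of these shapes (the toy dimensions and random initialization below are illustrative choices, not part of the algorithm):

```python
import numpy as np

N, M, D = 4, 5, 2          # users, items, latent factors (toy values)
U = np.random.randn(D, N)  # user latent-factor matrix
V = np.random.randn(D, M)  # item latent-factor matrix

R_hat = U.T @ V            # reconstructed preference matrix, shape (N, M)
# A single predicted preference is the inner product of the factor vectors:
i, j = 1, 3
assert np.isclose(R_hat[i, j], U[:, i] @ V[:, j])
```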

When we solve this problem, the loss function is

$$E = \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{M} I_{ij}\left(R_{ij} - U_i^\top V_j\right)^2 + \frac{\lambda_U}{2}\sum_{i=1}^{N}\|U_i\|^2 + \frac{\lambda_V}{2}\sum_{j=1}^{M}\|V_j\|^2,$$

where $I_{ij}$ is an indicator that equals 1 if user $i$ rated item $j$ and 0 otherwise, and $\lambda_U$, $\lambda_V$ are regularization coefficients.

Take the negative gradients of the loss function; then

$$-\frac{\partial E}{\partial U_i} = \sum_{j=1}^{M} I_{ij}\left(R_{ij} - U_i^\top V_j\right)V_j - \lambda_U U_i,$$

$$-\frac{\partial E}{\partial V_j} = \sum_{i=1}^{N} I_{ij}\left(R_{ij} - U_i^\top V_j\right)U_i - \lambda_V V_j.$$

Using these two negative gradient formulas, we can carry out the matrix factorization by gradient descent, as sketched below.
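
Here is a minimal NumPy sketch of this procedure, assuming a rating matrix in which 0 marks an unobserved entry (that encoding, the function name, and the default hyperparameters are choices of this sketch, not taken from the referenced posts):

```python
import numpy as np

def mf_gradient_descent(R, D=10, lr=0.01, lam=0.02, epochs=2000):
    """Factorize R (N x M) into U (D x N) and V (D x M) by gradient descent.

    R:   rating matrix where 0 marks an unobserved entry (sketch assumption).
    lam: regularization coefficient (here lambda_U = lambda_V = lam).
    """
    N, M = R.shape
    I = (R > 0).astype(float)          # indicator matrix I_ij
    rng = np.random.default_rng(0)
    U = 0.1 * rng.standard_normal((D, N))
    V = 0.1 * rng.standard_normal((D, M))
    for _ in range(epochs):
        E = I * (R - U.T @ V)          # residuals on observed entries only
        U += lr * (V @ E.T - lam * U)  # step along the negative gradients
        V += lr * (U @ E - lam * V)
    return U, V

# Toy usage: the missing (zero) entries get filled in by the reconstruction.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
U, V = mf_gradient_descent(R, D=2)
print(np.round(U.T @ V, 2))
```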

Probabilistic Matrix Factorization

Above is the basic MF algorithm. PMF adds Gaussian probability assumptions on top of MF. The purpose is to handle users with few ratings, for whom accurate results are hard to obtain; that is, it aims to alleviate the user cold-start problem. It turns the formula above into a conditional distribution over the observed rating matrix:

$$p(R \mid U, V, \sigma^2) = \prod_{i=1}^{N}\prod_{j=1}^{M}\left[\mathcal{N}\!\left(R_{ij} \mid U_i^\top V_j, \sigma^2\right)\right]^{I_{ij}},$$

where $\mathcal{N}(x \mid \mu, \sigma^2)$ denotes the Gaussian density with mean $\mu$ and variance $\sigma^2$.

Meanwhile, zero-mean spherical Gaussian priors are assumed on the user and item feature vectors:

$$p(U \mid \sigma_U^2) = \prod_{i=1}^{N}\mathcal{N}\!\left(U_i \mid 0, \sigma_U^2\mathbf{I}\right), \qquad p(V \mid \sigma_V^2) = \prod_{j=1}^{M}\mathcal{N}\!\left(V_j \mid 0, \sigma_V^2\mathbf{I}\right).$$

Then the log of their joint posterior distribution can be expressed as

$$\ln p(U, V \mid R, \sigma^2, \sigma_U^2, \sigma_V^2) = -\frac{1}{2\sigma^2}\sum_{i=1}^{N}\sum_{j=1}^{M} I_{ij}\left(R_{ij} - U_i^\top V_j\right)^2 - \frac{1}{2\sigma_U^2}\sum_{i=1}^{N}\|U_i\|^2 - \frac{1}{2\sigma_V^2}\sum_{j=1}^{M}\|V_j\|^2 + C,$$

where $C$ collects terms that do not depend on $U$ or $V$. Maximizing this posterior with fixed hyperparameters is therefore equivalent to minimizing the regularized MF loss above with $\lambda_U = \sigma^2/\sigma_U^2$ and $\lambda_V = \sigma^2/\sigma_V^2$.

The actual model used in the original paper is slightly more involved, because the mean is passed through a logistic (sigmoid) function $g(x) = 1/(1 + e^{-x})$ to bound the predictions:

$$p(R \mid U, V, \sigma^2) = \prod_{i=1}^{N}\prod_{j=1}^{M}\left[\mathcal{N}\!\left(R_{ij} \mid g\!\left(U_i^\top V_j\right), \sigma^2\right)\right]^{I_{ij}},$$

with the observed ratings $1, \dots, K$ first mapped to $[0, 1]$ via $t(r) = (r - 1)/(K - 1)$.
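
Below is a sketch of MAP training for this sigmoid variant, with the regularization weights standing in for $\sigma^2/\sigma_U^2$ and $\sigma^2/\sigma_V^2$ (the function names, the 0-means-unobserved encoding, and the default values are assumptions of this sketch, not the paper's code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pmf_map(R, K=5, D=10, lr=0.1, lam_u=0.01, lam_v=0.01, epochs=2000):
    """MAP estimation for PMF with a logistic mean function.

    R:            rating matrix, 0 marks an unobserved entry (sketch assumption).
    K:            number of rating levels; ratings 1..K are mapped to [0, 1].
    lam_u, lam_v: stand-ins for sigma^2/sigma_U^2 and sigma^2/sigma_V^2.
    """
    N, M = R.shape
    I = (R > 0).astype(float)
    T = np.where(R > 0, (R - 1.0) / (K - 1.0), 0.0)  # t(r) = (r-1)/(K-1)
    rng = np.random.default_rng(0)
    U = 0.1 * rng.standard_normal((D, N))
    V = 0.1 * rng.standard_normal((D, M))
    for _ in range(epochs):
        G = sigmoid(U.T @ V)             # g(U_i^T V_j) for all pairs
        E = I * (T - G) * G * (1.0 - G)  # residual times sigmoid slope g'(x)
        U += lr * (V @ E.T - lam_u * U)  # ascend the log posterior
        V += lr * (U @ E - lam_v * V)
    return U, V
```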

Hyperparameter Estimation

If we want to estimate $U$, $V$, and the hyperparameters at the same time, the posterior above becomes

$$p(U, V, \sigma^2, \Theta_U, \Theta_V \mid R) \propto p(R \mid U, V, \sigma^2)\, p(U \mid \Theta_U)\, p(V \mid \Theta_V)\, p(\Theta_U)\, p(\Theta_V),$$

where $\Theta_U$ and $\Theta_V$ are the parameters of the priors on $U$ and $V$.

The maximum a posteriori estimate at this point can be found with the EM algorithm, alternating between updating $U$ and $V$ with the hyperparameters fixed and updating the hyperparameters with $U$ and $V$ fixed.
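
Assuming spherical Gaussian priors and flat hyperpriors, the hyperparameter half of that alternation has closed-form updates; a sketch (not the paper's exact scheme, which also covers full covariances and mixture priors):

```python
import numpy as np

def update_hyperparameters(U, V, T, I, G):
    """Closed-form hyperparameter updates given point estimates of U and V.

    Assumes spherical Gaussian priors and flat hyperpriors (sketch assumption).
    U: D x N user factors, V: D x M item factors,
    T: (rescaled) observed ratings, I: indicator matrix, G: predictions g(U^T V).
    """
    D, N = U.shape
    _, M = V.shape
    sigma_U2 = np.sum(U ** 2) / (N * D)            # MLE of user prior variance
    sigma_V2 = np.sum(V ** 2) / (M * D)            # MLE of item prior variance
    sigma2 = np.sum(I * (T - G) ** 2) / np.sum(I)  # noise variance, observed only
    return sigma2, sigma_U2, sigma_V2
```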