Definition

Cosine similarity evaluates how similar two vectors are by calculating the cosine of the angle between them. The vectors are drawn into a vector space according to their coordinate values; the most common case is two-dimensional space. A vector has a direction. If the angle between two vectors is 0 degrees, the cosine is 1; the cosine of any other angle is less than 1, and its minimum value is -1. The cosine of the angle therefore tells us whether the two vectors point in roughly the same direction. When two vectors have exactly the same orientation, the cosine similarity is 1. When the angle between them is 90°, it is 0. When they point in opposite directions, it is -1.

Cosine similarity does not depend on the length of the vectors, only on the direction in which they point. This is why it suits some scenarios and not others. It is usually used in positive space, where every coordinate is non-negative, so in practice the value falls between 0 and 1.

Formula
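
For two n-dimensional vectors, the standard definition of cosine similarity is the dot product divided by the product of the vector lengths. The symbols A, B, n and θ are chosen here for illustration:

```latex
\cos\theta
  = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
  = \frac{\sum_{i=1}^{n} A_i B_i}
         {\sqrt{\sum_{i=1}^{n} A_i^{2}} \; \sqrt{\sum_{i=1}^{n} B_i^{2}}}
```

The value is undefined when either vector has zero length, because the denominator is then 0.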

Properties

The cosine value lies in the range [-1, 1]. The closer the value is to 1, the more closely the two vectors point in the same direction. The closer it is to -1, the more nearly they point in opposite directions. A value close to 0 means the two vectors are nearly orthogonal.
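
These three cases can be checked with a minimal sketch in plain Python. The function name `cosine_similarity` and the sample vectors are assumptions made for illustration, not part of any library:

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Same direction, different length: the value is close to 1.
same = cosine_similarity([1, 2], [2, 4])
# Perpendicular vectors: the value is 0.
orthogonal = cosine_similarity([1, 0], [0, 1])
# Opposite directions: the value is close to -1.
opposite = cosine_similarity([1, 2], [-1, -2])

print(same, orthogonal, opposite)
```

The first pair differs only in length, which shows that the result depends on direction alone. Floating-point rounding can leave the result a tiny distance from exactly 1 or -1.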

Application scenarios

<1> Group recommendation. When apps such as Momo and Maimai recommend groups, cosine similarity can be used to measure how similar two groups are. Suppose group A has three members: Zhang San, Li Si and Wang Wu. Group B has four: Sun Wukong, Zhu Bajie, Sand Monk and Zhang San. Combine the members of the two groups and remove duplicates to get a 6-dimensional space with one dimension per person. Each group is then a vector that has 1 where the person is a member and 0 otherwise:

          Zhang San  Li Si  Wang Wu  Sun Wukong  Zhu Bajie  Sand Monk
Group A   1          1      1        0           0          0
Group B   1          0      0        1           1          1

Then calculate the cosine similarity of the two vectors. If it is very high, group B will be recommended to any new user who later joins group A.

<2> User recommendation in apps such as Kuaishou and Douyin. Each user is described by a vector of interest weights, one per content category:

user   military  automobile  funny  education  clothing  ......
1      0.1       0.5         0.02   0.03       0.12      ......
2      0.11      0.45        0.03   0.025      0.115     ......
3, 4, ...

Cosine similarity is not suitable for comparing two users here, because the magnitude of each weight matters and not only the direction of the vector. Euclidean distance is the more appropriate measure.

<3> Text sentiment analysis. A supervised machine learning algorithm segments the training data into words and then builds a word vector in which each word has its own weight. New comments are segmented in the same way, and word vectors are built for them as well.
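
Scenarios <1> and <2> can be worked through in a short sketch. The helper names `cosine_similarity` and `euclidean_distance` are assumptions made for illustration. The user vectors use only the five category weights that are visible in the table, and the 0/1 membership encoding follows the group example:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (both must be non-zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between a and b; sensitive to magnitude."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Scenario <1>: two groups compared by shared membership.
group_a = {"Zhang San", "Li Si", "Wang Wu"}
group_b = {"Sun Wukong", "Zhu Bajie", "Sand Monk", "Zhang San"}

# The union with duplicates removed gives one dimension per person.
members = sorted(group_a | group_b)
vec_a = [1 if m in group_a else 0 for m in members]
vec_b = [1 if m in group_b else 0 for m in members]
group_similarity = cosine_similarity(vec_a, vec_b)

# Scenario <2>: two users compared by interest weights
# (assumption: only the five visible columns are used).
user_1 = [0.1, 0.5, 0.02, 0.03, 0.12]
user_2 = [0.11, 0.45, 0.03, 0.025, 0.115]
user_distance = euclidean_distance(user_1, user_2)

print(len(members), group_similarity, user_distance)
```

The two groups share only one member, so their cosine similarity is low, and group B would not be a strong recommendation under this measure. For the users, a smaller Euclidean distance means more similar interests.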