preface

Recently, I found taobao’s image search function when shopping taobao. It may be that I am too Low. This technical point has been implemented for a long time. Think you can realize this function, I think that at first, on the first two images from the top left corner pixels has been compared to the lower right corner of the last one pixel, and record when comparing their similarities, may be I was too naive (mainly knowledge limited the imagination), do have a lot of problems, For example, two images are not the same size, the location of the core elements is different, etc. Finally, I had to rely on the network to find an algorithm called Average Hash algorithm. Next, I elaborated its basic ideas and applicable scenarios.

The basic idea of mean hashing

1, reduce the size:

The quickest way to get rid of the high frequency and detail in an image is to shrink it down to 8×8, 64 pixels in all. Don’t keep the aspect ratio, just make it an 8-by-8 square. In this way, you can compare pictures of any size, eliminating the differences caused by different sizes and proportions.

2. Simplified colors:

Convert small 8-by-8 images into grayscale images.

3. Calculate the average value:

Calculate the gray mean of all 64 pixels.

4. Compare the gray scale of pixels:

The gray level of each pixel is compared with the average. Greater than or equal to the average, 1; It’s less than the average. Let’s call that 0.

5. Calculate the hash value:

The results of the previous step are combined to form a 64-bit integer, which is the fingerprint of this image. The order of composition is not important, as long as all images are in the same order.

If the image is zoomed in or out, or the aspect ratio is changed, the resulting value does not change. Increasing or decreasing brightness or contrast, or changing color, doesn’t make much difference to the hash value. Biggest advantage: calculation speed is fast!

After completing the above steps, an image is equivalent to having its own “fingerprint”, and then it is necessary to calculate the number of different bits, namely the Hamming distance (for example, the Hamming examples of 1010001 and 1011101 are 2, namely different numbers).

If the Hamming distance is less than 5, it is somewhat different, but close, and if the Hamming distance is greater than 10, it is a completely different picture.

The above is the basic idea of mean hashing, generally speaking, is relatively simple.

C # realize

public class ImageHashHelper
{
    /// <summary>
    ///Get thumbnails
    /// </summary>
    /// <returns></returns>
    private static Bitmap GetThumbImage(Image image, int w, int h)
    {
        Bitmap bitmap = new Bitmap(w, h);
        Graphics g = Graphics.FromImage(bitmap);
        g.DrawImage(image,
            new Rectangle(0.0, bitmap.Width, bitmap.Height),
            new Rectangle(0.0, image.Width, image.Height), GraphicsUnit.Pixel);
        return bitmap;
    }

    /// <summary>
    ///Convert the picture to grayscale image
    /// </summary>
    /// <returns></returns>
    private static Bitmap ToGray(Bitmap bmp)
    {
        for (int i = 0; i < bmp.Width; i++)
        {
            for (int j = 0; j < bmp.Height; j++)
            {
                // Get the RGB color of the pixel at the point
                Color color = bmp.GetPixel(i, j);
                // Use the formula to calculate the gray value
                int gray = (int)(color.R * 0.3 + color.G * 0.59 + color.B * 0.11);// Calculate method 1
                //int gray1 = (int)((color.r + color.g + color.b) / 3.0m); // Calculate method 2Color newColor = Color.FromArgb(gray, gray, gray); bmp.SetPixel(i, j, newColor); }}return bmp;
    }

    /// <summary>
    ///Get the mean hash of the picture
    /// </summary>
    /// <returns></returns>
    public static int[] GetAvgHash(Bitmap bitmap)
    {
        Bitmap newBitmap = ToGray(GetThumbImage(bitmap, 8.8));
        int[] code = new int[64];
        // Calculate the gray average of all 64 pixels.
        List<int> allGray = new List<int> ();for (int row = 0; row < bitmap.Width; row++)
        {
            for (int col = 0; col < bitmap.Height; col++) { allGray.Add(newBitmap.GetPixel(row, col).R); }}double avg = allGray.Average(a => a);// Get the average value
        // Compare the grayscale of pixels
        for (int i = 0; i < allGray.Count; i++)
        {
            code[i] = allGray[i] >= avg ? 1 : 0;// Combine the comparison results
        }
        // Return the result
        return code;
    }

    /// <summary>
    ///Compare the two AvgHash
    /// </summary>
    /// <returns></returns>
    public static int Compare(int[] code1, int[] code2)
    {
        int v = 0;
        for (int i = 0; i < 64; i++)
        {
            if(code1[i] == code2[i]) { v++; }}returnv; }}Copy the code

Here, we directly obtain the gray values of 64 pixels through R in the GetAvgHash function, because RGB is the same, so any one can be used.

Gray value conversion: baike.baidu.com/item/ gray value /10…

Effect demonstration:

1. Original image search

2, complete Mosaic search

Source code download:

Click download source

The last

Mean hashing is suitable for thumbnails to find the original image, but not for people matching.

References:

www.hackerfactor.com/blog/index….