Search for “little monkey technical notes” to follow my public account; feel free to contact me there if you have any questions.

We have all run into this scenario: when uploading a photo to register for an exam, the website requires that the photo be no larger than a certain size and be in “.jpg” format.

So you happily upload your best-looking photo, but the site tells you the format is wrong and asks you to upload it again. At this point countless doubts well up in your mind (along with a fair amount of silent cursing): my photo clearly ends in “.jpg” and its size meets the requirements, so why is it rejected?

We often assume (at least on a Windows computer; I can't speak for the Mac, because I don't have one) that an image whose name ends in “.jpg” must be a genuine JPG file. I thought so too at first, until a few days ago, when I fell into a big pit, a really big one, while integrating with another system.

My integration project required a file that was a “JPG” image and whose base64 encoding began with “/9j”. So I uploaded what appeared to be a conforming image saved on my computer, and got back a pile of error messages. I tried again with some other images and found that some worked and some didn't. To be honest, I fell apart inside! It felt as if I no longer knew what a picture was.

When I got home I kept wondering why the encoding was required to start with “/9j”. I opened Baidu and entered the keyword “/9j”. Hehe, I had to laugh. What were these results? They had nothing to do with my problem!

What on earth was this? Even the mighty Baidu gave me nothing! And so I kept searching until 12 o'clock……

I couldn't take it anymore, so I went to bed. But I tossed and turned, then picked up my phone and kept searching…… All of a sudden, I came across an article analyzing the header information of image files! A light went on in my head. So I got up, silently turned on the computer, and opened Baidu……

It turns out that when a computer stores an image, it also stores basic information about it, such as the image type and its width and height. This basic information is called the image header. All right, forgive my ignorance: I once naively thought the type was decided by the file suffix.

An image is stored pixel by pixel, and underneath it is just a binary file, so a file header is needed to hold information about the file. After some searching, I found the following header signatures (in hexadecimal) for different image formats:

1. BMP file header (2 bytes): 42 4D
2. PNG file header (8 bytes): 89 50 4E 47 0D 0A 1A 0A
3. GIF file header (6 bytes): 47 49 46 38 followed by 37 61 or 39 61 (“GIF87a” or “GIF89a”)
4. JPEG/JPG file header (2 bytes): FF D8 (SOI, the start-of-image marker that identifies a JPEG file)

I saved an image with the suffix “.jpg” on my computer and opened it with UE, a powerful tool. Just as I expected, I could see the contents of the file.

Unsurprisingly, you won't be able to read most of this, because it is a hexadecimal dump of the file. The important part, which I've highlighted for you, is “FF D8” at the very start.
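The same check can be done without a hex editor. Here is a minimal Python sketch, assuming we only care about the four formats listed above; the names `SIGNATURES`, `sniff_image_type` and `sniff_file` are my own, made up for illustration.

```python
# Header signatures for the four formats discussed above.
# The GIF entry is the 4-byte prefix "GIF8" shared by both GIF versions.
SIGNATURES = {
    "bmp": bytes.fromhex("424D"),
    "png": bytes.fromhex("89504E470D0A1A0A"),
    "gif": bytes.fromhex("47494638"),
    "jpg": bytes.fromhex("FFD8"),
}


def sniff_image_type(data: bytes):
    """Return the format whose signature matches the start of data, or None."""
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


def sniff_file(path: str):
    """Read only the first 8 bytes, enough for the longest signature (PNG)."""
    with open(path, "rb") as f:
        return sniff_image_type(f.read(8))
```

Calling `sniff_file` on a “.jpg” file that is really a PNG returns "png", whatever the suffix says.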

Suppose we want to base64-encode a keyword like “hello”. We first convert “hello” to binary using the ASCII code of each character, which gives “1101000 1100101 1101100 1101100 1101111”. I've given you an ASCII table here; its values are in decimal, so each one has to be converted from base 10 to base 2.

Base64 stipulates that if a character's binary value has fewer than eight bits, zeros are added at the high end to make eight. The resulting bit string is then cut into groups of six bits, and if the last group has fewer than six bits, zeros are added at its low end. Finally, each 6-bit group is converted to decimal and looked up in the base64 encoding table. The parsing of “hello” is as follows:

So the result of base64-encoding “hello” is “aGVsbG8=”. In case you're wondering why there's an “=”: base64 pads its output to a multiple of four characters, so one or two “=” are appended whenever the input length is not a multiple of three bytes.
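The steps above can be sketched by hand in Python. This is only an illustration of the principle, not how you would encode in practice (the standard `base64` module does that); the function name `manual_b64encode` is my own.

```python
import base64

B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def manual_b64encode(data: bytes) -> str:
    # Step 1: write every byte as exactly 8 bits, zero-padded at the high end.
    bits = "".join(format(byte, "08b") for byte in data)
    # Step 2: add zeros at the low end so the length is a multiple of 6.
    bits += "0" * (-len(bits) % 6)
    # Step 3: look up each 6-bit group in the Base64 table.
    chars = [B64_ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Step 4: pad the output with "=" to a multiple of 4 characters.
    chars.append("=" * (-len(chars) % 4))
    return "".join(chars)


print(manual_b64encode(b"hello"))  # aGVsbG8=
# Cross-check against the standard library.
assert manual_b64encode(b"hello") == base64.b64encode(b"hello").decode()
```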

So back to “FF D8”, the header signature of a JPG file: what does it become after base64 encoding?
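A quick way to answer this is to let Python's standard `base64` module do the encoding. One detail worth spelling out: base64 works on groups of three bytes, so the third character depends on the byte that follows FF D8. In a JPEG file that byte is FF, because every marker after SOI starts with FF. The variable names below are my own.

```python
import base64

soi_only = bytes.fromhex("FFD8")         # the 2-byte SOI marker on its own
soi_plus_next = bytes.fromhex("FFD8FF")  # SOI plus the first byte of the next marker

encoded_soi = base64.b64encode(soi_only).decode()
encoded_soi_next = base64.b64encode(soi_plus_next).decode()

print(encoded_soi)       # /9g=
print(encoded_soi_next)  # /9j/
```

So it is the three bytes FF D8 FF that produce the “/9j” prefix.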

Thank god, now I can see why it starts with “/9j” (the “j” comes from the FF byte that always follows FF D8 in a JPEG file). There's actually another quick way to check whether a file is a JPG: we can open it with Notepad.

When you open it, most of what you see will be unreadable, but the important part, which I've labeled for you, is “JFIF”. This is a very important label: “JFIF” stands for “JPEG File Interchange Format”. (JPEG files written by some cameras carry “Exif” here instead, so treat this as a quick check rather than a definitive test.)

To reproduce the problem of a file that clearly had the “.jpg” suffix but failed to be recognized, I took a “.png” image and changed its suffix to “.jpg”. Then I opened it with Notepad and viewed its contents. As you can see, there is no “JFIF”, so this is not a JPG file, which is why the upload was not recognized.

If you go to bed with a problem unsolved, you can't sleep! Through this experience I learned how Base64 encoding works, and understood that the type of a file stored on a computer is determined not simply by its suffix but by its header information. So from now on, when your program needs to determine the type of a file, don't just check the suffix and call it done!
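For an upload like the one in my integration project, where the image arrives as a base64 string, the check might look like the sketch below. This is an assumption about how such a validator could be written, not the actual code of the system I integrated with; `looks_like_jpeg_upload` and `JPEG_MAGIC` are hypothetical names.

```python
import base64
import binascii

# SOI (FF D8) plus the FF that starts the next marker.
JPEG_MAGIC = bytes.fromhex("FFD8FF")


def looks_like_jpeg_upload(filename: str, b64_payload: str) -> bool:
    """Accept only if both the suffix and the decoded leading bytes say JPEG."""
    if not filename.lower().endswith((".jpg", ".jpeg")):
        return False
    try:
        # The first 4 Base64 characters decode to exactly 3 bytes.
        head = base64.b64decode(b64_payload[:4], validate=True)
    except (binascii.Error, ValueError):
        return False
    return head == JPEG_MAGIC
```

Checking the suffix alone would have accepted my renamed “.png” file; comparing the decoded header bytes is what catches it.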