Sometimes you open a file and see characters that you recognize one by one, but that make no sense together. For example: 烫锟斤拷

The simple explanation is that the file was opened the wrong way, for example a document created with WPS

And then opened with Notepad

The deeper explanation is that encoding and decoding are inconsistent: the bytes were written with one character encoding and read back with another

It is like pressing the same key under an English input method and a pinyin input method: the output is different.

When I am grinding away at work, I often trip over switching input methods back and forth.
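The mismatch described above can be reproduced in a few lines. The following is a minimal sketch, assuming Python 3 and its built-in `utf-8` and `gbk` codecs; the sample string is only an illustration. It also shows one well-known way the garbled text 锟斤拷 (pinyin: kun jin kao) comes about.

```python
# The same bytes, read back with two different encodings.
text = "中文"
data = text.encode("utf-8")                   # write the text as UTF-8 bytes

right = data.decode("utf-8")                  # decode with the matching encoding
wrong = data.decode("gbk", errors="replace")  # decode with a mismatched encoding

print(right)  # the original text comes back
print(wrong)  # unrelated characters: garbled text

# The classic garble: two U+FFFD replacement characters saved as UTF-8,
# then read back as GBK.
garbled = ("\ufffd" * 2).encode("utf-8").decode("gbk")
print(garbled)  # 锟斤拷
```

The bytes never change here. Only the rule used to interpret them does.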

The ASCII character set

The computer originated in America

Computers store data in binary

Each 0 or 1 is called a bit

And eight bits make up a byte

Each bit can represent two states

Each byte can represent 2^8 = 256 states

So what’s the use of these states?

The Americans took the 128 states from 0 to 127

And mapped all of them to characters

Including upper- and lower-case letters, punctuation marks, digits, and control characters

This is what is commonly called the ASCII character set
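The arithmetic and the mapping above can be checked directly. This is a small sketch, assuming Python 3; the sample string and variable names are chosen only for illustration.

```python
# Eight bits per byte give 2^8 possible states.
bits_per_byte = 8
states = 2 ** bits_per_byte
print(states)  # 256

# ASCII maps the values 0..127 to characters.
for ch in "Aa0!":
    print(ch, ord(ch), format(ord(ch), "08b"))  # character, value, bit pattern

# Every byte of ASCII-encoded text stays below 128.
encoded = "Hello, ASCII!".encode("ascii")
print(len(encoded))                   # 13 characters, 13 bytes
print(all(b < 128 for b in encoded))  # True
```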

The extended ASCII character set

Later,

As other countries began to use computers

Their letters were not in ASCII

So they used the remaining values from 128 to 255

This is called the extended ASCII character set
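Because different regions filled the values 128 to 255 differently, the same byte can mean different letters. The sketch below assumes Python 3 and three of its standard codecs (`latin-1`, `iso8859_5`, `cp437`), picked here only as examples of such extensions.

```python
raw = bytes([0xE9])  # a single byte with a value above 127

as_latin1 = raw.decode("latin-1")      # a Western European extension
as_cyrillic = raw.decode("iso8859_5")  # a Cyrillic extension
as_cp437 = raw.decode("cp437")         # the original IBM PC code page
print(as_latin1, as_cyrillic, as_cp437)  # three different characters

# Plain ASCII has no meaning for this byte at all.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    print("not ASCII:", exc.reason)
```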

ASCII Chinese extension

However,

It was not that simple

By the time the Chinese started using computers

There was no room left

What to do?

The resourceful Chinese

Unceremoniously dropped the extended codes above 127

And let two consecutive bytes, each greater than 127, represent one Chinese character

In this way

The encoding problem for roughly 7,000 characters (6,763 commonly used Chinese characters plus symbols) was solved

This is the GB2312 encoding

It is a Chinese extension of ASCII
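The two-byte rule can be observed with Python's built-in `gb2312` codec. This is a minimal sketch, assuming Python 3; the sample characters are arbitrary.

```python
ascii_bytes = "A".encode("gb2312")  # ASCII characters keep their single byte
hanzi_bytes = "中".encode("gb2312")  # a Chinese character takes two bytes

print(len(ascii_bytes), list(ascii_bytes))
print(len(hanzi_bytes), [hex(b) for b in hanzi_bytes])
print(all(b > 127 for b in hanzi_bytes))  # True: both bytes are above 127

# Mixed text: one byte for the letter, two for the Chinese character.
mixed = "A中".encode("gb2312")
print(len(mixed))              # 3
print(mixed.decode("gb2312"))  # round-trips back to the original text
```

Since ASCII bytes are always below 128 and both bytes of a Chinese character are above 127, a decoder can tell the two apart while scanning.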

GBK

GB2312 soon turned out not to be enough

So it was extended into GBK

GBK not only contains everything in GB2312

It also adds many new Chinese characters

(including traditional characters), for more than 20,000 Chinese characters in total
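The difference in coverage can be probed with a small helper. This is a sketch, assuming Python 3 and its `gb2312` and `gbk` codecs; `can_encode` is a hypothetical helper written for this example, and the two characters (simplified 龙 and traditional 龍) are just one convenient pair.

```python
def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

simplified = "龙"
traditional = "龍"

for ch in (simplified, traditional):
    print(ch, can_encode(ch, "gb2312"), can_encode(ch, "gbk"))

# Characters already in GB2312 keep the same bytes in GBK.
same_bytes = simplified.encode("gb2312") == simplified.encode("gbk")
print(same_bytes)  # True
```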

GB18030

Later, ethnic minorities also began to use computers

So GBK was expanded into GB18030, which adds the scripts of minority languages
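One visible consequence of this expansion is that GB18030 is a variable-length encoding: a character takes one, two, or four bytes. The sketch below assumes Python 3 and its `gb18030` codec; the Tibetan letter is only an example of a character outside GBK.

```python
samples = {
    "A": "ASCII letter",
    "中": "Chinese character",
    "ཀ": "Tibetan letter",
}

lengths = {}
for ch, label in samples.items():
    encoded = ch.encode("gb18030")
    lengths[ch] = len(encoded)
    print(label, len(encoded), encoded.hex())

# GB18030 stays compatible with GBK for the characters GBK already had.
compatible = "中".encode("gb18030") == "中".encode("gbk")
print(compatible)  # True
```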

This series of Chinese character coding standards is commonly known as

Because each country had its own standard

A unified standard was developed so that text could be exchanged accurately

Unicode

Unicode was created

To incorporate all the symbols in the world

Each symbol is given a unique code

This was meant to solve the problem of garbled characters

(Although it hasn’t been solved yet)
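A short sketch of why the problem can persist, assuming Python 3: Unicode gives each character one code, but that code still has to be turned into bytes by some encoding. UTF-8, UTF-16, and UTF-32 are not mentioned in the original text and are used here only as familiar examples.

```python
ch = "中"
print(hex(ord(ch)))  # 0x4e2d, the unique Unicode code for this character

# The same code becomes different bytes under different encodings.
encodings = {enc: ch.encode(enc) for enc in ("utf-8", "utf-16-le", "utf-32-le")}
for enc, data in encodings.items():
    print(enc, len(data), data.hex())
```

So the reader of a file still has to know which encoding the writer used, and a wrong guess produces garbled text again.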

References:

Mp.weixin.qq.com/s/CSsN5rd-a…