Why can’t MySQL utf8 store Emoji?

  • Emoji

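First of all, we need to know how Emoji are encoded. Emoji is a Japanese word: "e" means picture and "moji" means character. Emoji are pictographs used to represent a variety of expressions, such as a smiling face for laughter or a cake for food. The range E63E to E757 that is often quoted is the private-use range originally used by the Japanese carrier NTT DoCoMo. In the Unicode standard itself, most Emoji are assigned code points above U+FFFF, for example U+1F600 (grinning face), which places them outside the Basic Multilingual Plane (BMP).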

  • MySQL utf8 and utf8mb4

The MySQL documentation describes the two character sets as follows. The character set named utf8 uses a maximum of three bytes per character and contains only BMP characters. As of MySQL 5.5.3, the utf8mb4 character set uses a maximum of four bytes per character and supports supplementary characters.

For a BMP character, utf8 and utf8mb4 have identical storage characteristics: same code values, same encoding, same length. For a supplementary character, utf8 cannot store the character at all, whereas utf8mb4 requires four bytes to store it. Because utf8 cannot store the character at all, you have no supplementary characters in utf8 columns and need not worry about converting characters or losing data when upgrading utf8 data from older versions of MySQL.

In other words, MySQL utf8 covers only the Basic Multilingual Plane (BMP), the first of Unicode's 17 planes, while utf8mb4 also covers the supplementary planes. For background on Unicode, I recommend reading the Wikipedia article to understand what the 17 planes are.
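
The byte lengths described above can be checked directly in Java. This is a minimal sketch; the class name Utf8Lengths and the sample characters are chosen only for illustration. It measures how many bytes a BMP character and a supplementary character occupy in standard UTF-8.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Lengths {
    // Number of bytes the text occupies when encoded as standard UTF-8.
    static int utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String latin = "A";                                     // U+0041, in the BMP
        String cjk = "\u4E2D";                                  // U+4E2D, in the BMP
        String emoji = new String(Character.toChars(0x1F600));  // U+1F600, supplementary

        System.out.println(utf8Bytes(latin)); // prints 1
        System.out.println(utf8Bytes(cjk));   // prints 3
        System.out.println(utf8Bytes(emoji)); // prints 4
    }
}
```

The four-byte result for U+1F600 is exactly the case that a three-byte-per-character column cannot hold.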

To sum up, most Emoji have Unicode code points outside the BMP, which is the only plane MySQL utf8 covers, so MySQL utf8 cannot be used to store them directly.
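
One way to see which strings are affected is to look for code points above U+FFFF before writing to a utf8 column. The helper below is a hypothetical sketch (BmpCheck and hasSupplementary are names invented here, not part of any MySQL or JDBC API). Note that it detects supplementary characters in general, not Emoji specifically: a few Emoji such as U+2764 (heavy black heart) sit inside the BMP and are not flagged.

```java
final class BmpCheck {
    // True if any code point in the text lies outside the BMP (above U+FFFF).
    static boolean hasSupplementary(String text) {
        return text.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        String plain = "hello";
        String withEmoji = "hello " + new String(Character.toChars(0x1F600));

        System.out.println(hasSupplementary(plain));     // prints false
        System.out.println(hasSupplementary(withEmoji)); // prints true
    }
}
```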

There are several ways to get MySQL to support Emoji:

  1. Change the MySQL table character set to utf8mb4

    There are plenty of tutorials for this online, so I won't go into detail. I find this approach inconvenient: it requires changing the database configuration and restarting the database, and restarting a database in an online production environment can be costly and risky.
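  2. Encode Emoji as Base64 on the server

    • Before writing to the database, the server encodes the Emoji string as Base64
    // MimeUtility is assumed here to be javax.mail.internet.MimeUtility (JavaMail);
    // encodeWord can throw UnsupportedEncodingException
    String mysqlColumn = MimeUtility.encodeWord(emojiStr);
    • After reading from the database, the server decodes the stored Base64 string
    // decodeWord can throw ParseException or UnsupportedEncodingException
    String emojiStr = MimeUtility.decodeWord(mysqlColumn);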

    With the Base64 approach, we only need to encode and decode on the server side; the database itself does not need to change.
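
If JavaMail is not on the classpath, the same round trip can be sketched with the JDK's own java.util.Base64. This swaps in a different class from the MimeUtility calls above, so the stored text will not have the same format. The class and method names (EmojiBase64, encode, decode) are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class EmojiBase64 {
    // Encode the text's UTF-8 bytes as plain ASCII Base64, safe for a utf8 column.
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    // Reverse of encode: Base64 text back to the original string.
    static String decode(String stored) {
        return new String(Base64.getDecoder().decode(stored), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String emojiStr = "hi " + new String(Character.toChars(0x1F600));
        String mysqlColumn = encode(emojiStr);   // value to write to the database
        String restored = decode(mysqlColumn);   // value after reading it back

        System.out.println(mysqlColumn);
        System.out.println(restored.equals(emojiStr)); // prints true
    }
}
```

The trade-off is that the stored value is roughly a third longer than the raw bytes and can no longer be searched or compared as text inside MySQL.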

Conclusion

I ran into an error when storing Emoji in a database myself. While searching for a fix, I found that the solutions online generally come down to the points above. The articles listed below are recommended reading for understanding why MySQL utf8 cannot store Emoji and for the basics of Unicode.

References

  • 1. The utf8mb4 Character Set (4-Byte UTF-8 Unicode Encoding)
  • 2. Unicode
  • 3. Emoji