This article was originally meant to have two parts: the first recording a Base64-related bug I ran into recently, and the second explaining how Base64 encoding works in detail. Halfway through I realized that something not very complicated was taking far too long to explain, which is not great for reading and comprehension (honestly, I was also feeling a little lazy today and wanted to go have some fun), so the detailed explanation of Base64 encoding will come in the next article.

0x01 The symptom

A provides an interface to B. The interface parameters are Base64-encoded before being passed.

But when A decodes the Base64 parameter passed by B, decoding fails with:

Illegal base64 character a

0x02 Cause Analysis

This is a pitfall that plenty of people online have already stepped into. In short, there are different implementations of the Base64 codec, and some of them are not compatible with each other.

For example, the problem I ran into above can be fully reproduced with the following code:

package org.mazhuang.base64test;

import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.util.Base64Utils;
import sun.misc.BASE64Encoder;

@SpringBootApplication
public class Base64testApplication implements CommandLineRunner {
    @Override
    public void run(String... args) throws Exception {
        byte[] content = "It takes a strong man to save himself, and a great man to save another.".getBytes();
        // Encode with sun.misc, decode with Spring (java.util.Base64 underneath)
        String encoded = new BASE64Encoder().encode(content);
        byte[] decoded = Base64Utils.decodeFromString(encoded);
        System.out.println(new String(decoded));
    }

    public static void main(String[] args) {
        SpringApplication.run(Base64testApplication.class, args);
    }
}

Running the code above throws an exception:

Caused by: java.lang.IllegalArgumentException: Illegal base64 character a
	at java.util.Base64$Decoder.decode0(Base64.java:714) ~[na:1.8.0_202-release]
	at java.util.Base64$Decoder.decode(Base64.java:526) ~[na:1.8.0_202-release]

Note: If the string in the test code is very short, such as "Hello, World", it decodes properly.

In other words, encoding with sun.misc.BASE64Encoder and then decoding with org.springframework.util.Base64Utils is what causes the problem. We can encode the string above with each of them and print the results to see the difference. Test code:

byte[] content = "It takes a strong man to save himself, and a great man to save another.".getBytes();

System.out.println(new BASE64Encoder().encode(content));
System.out.println("-- Gorgeous dividing line --");
System.out.println(Base64Utils.encodeToString(content));

Output:

SXQgdGFrZXMgYSBzdHJvbmcgbWFuIHRvIHNhdmUgaGltc2VsZiwgYW5kIGEgZ3JlYXQgbWFuIHRv
IHNhdmUgYW5vdGhlci4=
-- Gorgeous dividing line --
SXQgdGFrZXMgYSBzdHJvbmcgbWFuIHRvIHNhdmUgaGltc2VsZiwgYW5kIGEgZ3JlYXQgbWFuIHRvIHNhdmUgYW5vdGhlci4=

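You can see that the output of sun.misc.BASE64Encoder contains a line break. The ASCII code of the newline character is 0x0A, which is exactly the "a" in the error message "Illegal base64 character a", so the error makes sense. Let's take a closer look at where the difference comes from.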
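The failure can also be reproduced with java.util.Base64 alone, which helps on a JDK that does not ship sun.misc.BASE64Encoder. The following is a minimal sketch, assuming the wrapped text from the output above with its line break written as \n; the class and method names are illustrative and not part of the original project.

```java
import java.util.Base64;

// Illustrative only: class and method names are not from the original project.
final class NewlineDecodeDemo {
    // The wrapped output shown above, with the line break written as \n.
    static final String WRAPPED =
            "SXQgdGFrZXMgYSBzdHJvbmcgbWFuIHRvIHNhdmUgaGltc2VsZiwgYW5kIGEgZ3JlYXQgbWFuIHRv\n"
            + "IHNhdmUgYW5vdGhlci4=";

    // Returns the basic decoder's error message, or null if decoding succeeds.
    static String decodeError(String encoded) {
        try {
            Base64.getDecoder().decode(encoded);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The basic decoder rejects the embedded line break.
        System.out.println(decodeError(WRAPPED));
        // The same text without the line break decodes without an error.
        System.out.println(decodeError(WRAPPED.replace("\n", "")));
        // The newline character in hexadecimal.
        System.out.println(Integer.toHexString('\n'));
    }
}
```

The hexadecimal value printed on the last line is the character code that the decoder reports in its error message.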

0x03 Further

To jump to their implementations, hold Ctrl (or Command on macOS) in IntelliJ IDEA and click the method name.

3.1 sun.misc.BASE64Encoder.encode

This method mainly involves two classes under sun.misc, BASE64Encoder and CharacterEncoder; the latter is the parent class of the former.

The encode method that does the actual work lives in CharacterEncoder. An annotated version follows:


public void encode(InputStream inStream, OutputStream outStream)
    throws IOException {
    int     j;
    int     numBytes;
    // bytesPerLine is implemented in BASE64Encoder and returns 57
    byte    tmpbuffer[] = new byte[bytesPerLine()];

    // Construct a PrintStream from outStream
    encodeBufferPrefix(outStream);

    while (true) {
        // Read a maximum of 57 bytes
        numBytes = readFully(inStream, tmpbuffer);
        if (numBytes == 0) {
            break;
        }
        // No-op in BASE64Encoder
        encodeLinePrefix(outStream, numBytes);
        // Encode 3 bytes at a time into 4 characters; a short final group is zero-filled and padded with '='
        for (j = 0; j < numBytes; j += bytesPerAtom()) {
            // ...
        }
        if (numBytes < bytesPerLine()) {
            break;
        } else {
            // Append the line suffix (a newline)
            encodeLineSuffix(outStream);
        }
    }
    // No-op in BASE64Encoder
    encodeBufferSuffix(outStream);
}

Then in the CharacterEncoder class comment we can see the encoded format:

[Buffer Prefix]
[Line Prefix][encoded data atoms][Line Suffix]
[Buffer Suffix]

In BASE64Encoder, Buffer Prefix, Buffer Suffix, and Line Prefix are all empty, and Line Suffix is \n.

At this point we have found where the line break comes from: this encoder reads 57 bytes at a time as one line (57 / 3 * 4 = 76 characters after encoding) and appends the line suffix after every full line.
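The arithmetic behind the 57 and 76 figures, and behind the earlier note that a short string such as "Hello, World" decodes fine, can be checked with a few lines. This is only a sketch; the helper name encodedLength is made up for illustration.

```java
import java.util.Base64;

// Illustrative only: the class and helper names are hypothetical.
final class LineLengthDemo {
    // Padded Base64 length: every group of up to 3 input bytes becomes 4 characters.
    static int encodedLength(int byteCount) {
        return 4 * ((byteCount + 2) / 3);
    }

    public static void main(String[] args) {
        // One full 57-byte line of input
        System.out.println(encodedLength(57));
        // "Hello, World" is 12 bytes, well under 57, so no line is ever filled
        System.out.println(encodedLength(12));
        // The 71-byte test sentence needs more than one 76-character line
        System.out.println(encodedLength(71));
        // Cross-check the formula against java.util.Base64
        System.out.println(Base64.getEncoder().encodeToString(new byte[57]).length());
    }
}
```

An input shorter than 57 bytes never fills a line, so the loop exits before any line suffix is written, which is why short strings round-trip without trouble.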

3.2 org.springframework.util.Base64Utils.encodeToString

This path mainly involves two classes, org.springframework.util.Base64Utils and java.util.Base64; the former is mostly a thin wrapper around the latter.

Base64Utils.encodeToString ends up using the Base64.Encoder.RFC4648 encoder:

// isURL = false, newline = null, linemax = -1, doPadding = true
static final Encoder RFC4648 = new Encoder(false, null, -1, true);

Note the values of newline and linemax.

Then look at the Base64.Encoder.encode0 method, where the actual encoding is implemented:

private int encode0(byte[] src, int off, int end, byte[] dst) {
    // ...
    while (sp < sl) {
        // ...

        // This condition will not be satisfied. No newlines will be added
        if (dlen == linemax && sp < end) {
            for (byte b : newline) {
                dst[dp++] = b;
            }
        }
    }
    // ...
    return dp;
}

So with linemax = -1 this implementation never inserts line breaks.
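The effect of linemax and newline can be seen through the public java.util.Base64 API. The sketch below is an approximation, not the sun.misc code path: it uses Base64.getMimeEncoder(int, byte[]) to build an encoder with a line length of 76 and "\n" as separator, and compares it with the basic encoder. Class and method names are illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative only: class and method names are hypothetical.
final class LineMaxDemo {
    static final byte[] CONTENT =
            "It takes a strong man to save himself, and a great man to save another."
                    .getBytes(StandardCharsets.UTF_8);

    // Basic encoder: no line length limit and no separator, so one unbroken line.
    static String unwrapped() {
        return Base64.getEncoder().encodeToString(CONTENT);
    }

    // Line length 76 with "\n" as separator: a break after each full 76-character line.
    static String wrapped() {
        return Base64.getMimeEncoder(76, new byte[] {'\n'}).encodeToString(CONTENT);
    }

    public static void main(String[] args) {
        System.out.println(wrapped());
        System.out.println("-- Gorgeous dividing line --");
        System.out.println(unwrapped());
    }
}
```

For this particular input the wrapped form breaks after the 76th character, the same position as in the sun.misc output shown earlier; whether the two agree for every input is not something this sketch establishes.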

0x04 Summary

The analysis above settles it: the two encoders are implemented differently. During development, just take care to use a matching encoder and decoder, that is, encode with the encoder from one Java package and decode with the corresponding decoder from the same package.
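When the encoder on the other side cannot be changed, one possible workaround is a decoder that tolerates line breaks. This goes beyond the analysis above, so treat it as an assumption to verify in your own setup: java.util.Base64.getMimeDecoder() skips characters outside the Base64 alphabet, including line separators. The class and method names below are illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative only: class and method names are hypothetical.
final class LenientDecodeDemo {
    // Decodes Base64 text that may contain line breaks; the MIME decoder ignores them.
    static String decodeLenient(String encoded) {
        byte[] raw = Base64.getMimeDecoder().decode(encoded);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Hello, World" in Base64, with a line break inserted in the middle.
        System.out.println(decodeLenient("SGVsbG8s\nIFdvcmxk"));
    }
}
```

The trade-off is that a lenient decoder also silently skips other stray characters, so matching the encoder and decoder remains the cleaner fix.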

As for why the implementations differ, how each of them works, how Base64 itself works in detail, and so on, let's look at that next time. 😛


If you are interested in my articles, you can follow my WeChat official account "stuffy coders" to read more.