This is the fourth day of my participation in the November Gwen Challenge. Check out the details: The last Gwen Challenge 2021

FST concepts and definitions

FST stands for Fast Serialization Tool, an alternative implementation of Java serialization. It greatly improves on the two serious deficiencies of Java serialization mentioned in the previous article. Its features are as follows:

  • Roughly 10 times faster than the serialization provided by the JDK, with output more than 3 to 4 times smaller
  • Support for off-heap Maps, and persistence of off-heap Maps
  • Support for serialization to JSON

Use of FST serialization

There are two ways to use FST: a shortcut API, and the ObjectOutput and ObjectInput streams.

Use the serialization and deserialization interfaces provided by FSTConfiguration directly

public static void serialSample() {
    FSTConfiguration conf = FSTConfiguration.createAndroidDefaultConfiguration();
    User object = new User();
    object.setName("huaijin");
    object.setAge(30);
    System.out.println("serialization, " + object);
    byte[] bytes = conf.asByteArray(object);
    User newObject = (User) conf.asObject(bytes);
    System.out.println("deSerialization, " + newObject);
}

FSTConfiguration also provides an interface for registering classes. If a class is not registered, the object's class name is written to the stream by default. The shortcut above is an easy-to-use and efficient way to get a byte[] without going through a ByteArrayOutputStream.

With ObjectOutput and ObjectInput, you can control serialized writes at a finer grain:

static FSTConfiguration conf = FSTConfiguration.createAndroidDefaultConfiguration();

static void writeObject(OutputStream outputStream, User user) throws IOException {
    FSTObjectOutput out = conf.getObjectOutput(outputStream);
    out.writeObject(user);
    // Streams obtained from the configuration are reused, so flush instead of closing them
    out.flush();
    outputStream.close();
}

static User readObject(InputStream inputStream) throws Exception {
    FSTObjectInput input = conf.getObjectInput(inputStream);
    User user = (User) input.readObject(User.class);
    // Close the underlying stream, not the reused FSTObjectInput
    inputStream.close();
    return user;
}

Application of FST in Dubbo

  • Dubbo wraps FST in its own FstObjectInput and FstObjectOutput classes, which solve the problem of serializing and deserializing null references.

  • The FstFactory class uses the factory pattern to create FstObjectInput and FstObjectOutput instances. It uses the singleton pattern to keep a single FSTConfiguration for the entire application, and registers all classes to be serialized with that FSTConfiguration during initialization.

  • FstSerialization implements the common Serialization interface to expose the serialize and deserialize capabilities.

FST serialization/deserialization

FST serialized storage format

Almost all formats that store serialized objects as bytes share a similar structure, whether they are class files, .so files, or dex files. None of them is particularly innovative; at most they apply some compression to field content, as the UTF-8 encoding we use most often also does.

FST's serialized storage is likewise nothing new compared with general byte-oriented formats. Take the following FST-serialized byte file:

00000000:  0001 0f63 6f6d 2e66 7374 2e46 5354 4265
00000010:  616e f701 fc05 7630 7374 7200 

Format:

Header | class name length | class name (String) | field 1 type (1 byte) | [length] | content | field 2 type (1 byte) | [length] | content | ...

  • 0000: type tag of the byte array; 00 identifies OBJECT
  • 0001: encoding of the class name; 00 indicates UTF and 01 indicates ASCII
  • 0002: length of the class name (1 byte) = 15
  • 0003~0011: class name string (15 bytes)
  • 0012: Integer type identifier 0xF7
  • 0013: value of the Integer = 1
  • 0014: String type identifier 0xFC
  • 0015: length of the String = 5
  • 0016~001A: String value “v0str”
  • 001B~001C: END
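
To make the layout above concrete, here is a small sketch that walks the dumped bytes by hand. It is a toy decoder written only against the offsets listed above, not FST's real reader: the class FstDumpWalkthrough and its method names are made up for this illustration, it handles only the three tags that appear in the dump, and it assumes the integer and the lengths each fit in one byte, as they do here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Toy decoder for the byte dump above; hypothetical, not part of FST. */
class FstDumpWalkthrough {

    // Tag values as they appear in the dump: 0x00, 0xF7 (-9) and 0xFC (-4).
    static final byte OBJECT = 0;
    static final byte BIG_INT = -9;
    static final byte STRING = -4;

    /** The 28 bytes of the dump, transcribed from the hex listing. */
    static final int[] DUMP = {
        0x00, 0x01, 0x0f, 0x63, 0x6f, 0x6d, 0x2e, 0x66,
        0x73, 0x74, 0x2e, 0x46, 0x53, 0x54, 0x42, 0x65,
        0x61, 0x6e, 0xf7, 0x01, 0xfc, 0x05, 0x76, 0x30,
        0x73, 0x74, 0x72, 0x00
    };

    static List<String> decode(int[] dump) {
        byte[] b = new byte[dump.length];
        for (int i = 0; i < dump.length; i++) {
            b[i] = (byte) dump[i];
        }

        List<String> out = new ArrayList<>();
        int pos = 0;

        // 0000: header tag
        if (b[pos++] != OBJECT) {
            throw new IllegalArgumentException("expected OBJECT tag");
        }
        // 0001: class name encoding (0 = UTF, 1 = ASCII)
        Charset cs = b[pos++] == 1 ? StandardCharsets.US_ASCII : StandardCharsets.UTF_8;
        // 0002: class name length, then the name itself
        int nameLen = b[pos++] & 0xFF;
        out.add("class=" + new String(b, pos, nameLen, cs));
        pos += nameLen;

        // Fields: one type tag, an optional length, then the content
        while (pos < b.length) {
            byte tag = b[pos++];
            if (tag == BIG_INT) {
                out.add("int=" + b[pos++]);          // single-byte value in this dump
            } else if (tag == STRING) {
                int len = b[pos++] & 0xFF;
                out.add("string=" + new String(b, pos, len, cs));
                pos += len;
            } else if (tag == 0) {
                out.add("end");                      // trailing 00 byte
                break;
            } else {
                throw new IllegalArgumentException("tag not handled by this sketch: " + tag);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : decode(DUMP)) {
            System.out.println(line);
        }
    }
}
```

Running it lists the class name com.fst.FSTBean, the integer 1, the string v0str, and the end marker, which matches the offset table entry by entry.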

See FSTObjectInput#instantiateSpecialTag for how the different types are read. The tag values for the different types are defined as constants in FSTObjectOutput:

public class FSTObjectOutput implements ObjectOutput {
    private static final FSTLogger LOGGER = FSTLogger.getLogger(FSTObjectOutput.class);
    public static Object NULL_PLACEHOLDER = new Object() {
        public String toString() { return "NULL_PLACEHOLDER"; }
    };
    public static final byte SPECIAL_COMPATIBILITY_OBJECT_TAG = -19; // see issue 52
    public static final byte ONE_OF = -18;
    public static final byte BIG_BOOLEAN_FALSE = -17;
    public static final byte BIG_BOOLEAN_TRUE = -16;
    public static final byte BIG_LONG = -10;
    public static final byte BIG_INT = -9;
    public static final byte DIRECT_ARRAY_OBJECT = -8;
    public static final byte HANDLE = -7;
    public static final byte ENUM = -6;
    public static final byte ARRAY = -5;
    public static final byte STRING = -4;
    public static final byte TYPED = -3; // var class == object written class
    public static final byte DIRECT_OBJECT = -2;
    public static final byte NULL = -1;
    public static final byte OBJECT = 0;
    protected FSTEncoder codec;
    ...
}

FST serialization and deserialization principles

If a bean's definition has changed by the time it is deserialized, the deserializer needs a compatibility strategy. For JDK serialization and deserialization, serialVersionUID plays an important role in version control. FST addresses this problem by ordering fields according to the @Version annotation.

When deserializing, FST reflects over all members of the class and sorts them. This ordering is crucial for compatibility, and it is what makes @Version work. FSTClazzInfo defines a defFieldComparator that sorts all fields of the bean:

public final class FSTClazzInfo {
    public static final Comparator<FSTFieldInfo> defFieldComparator = new Comparator<FSTFieldInfo>() {
        @Override
        public int compare(FSTFieldInfo o1, FSTFieldInfo o2) {
            int res = 0;

            if (o1.getVersion() != o2.getVersion()) {
                return o1.getVersion() < o2.getVersion() ? -1 : 1;
            }

            // order: version, boolean, primitives, conditionals, object references
            if (o1.getType() == boolean.class && o2.getType() != boolean.class) {
                return -1;
            }
            if (o1.getType() != boolean.class && o2.getType() == boolean.class) {
                return 1;
            }

            if (o1.isConditional() && !o2.isConditional()) {
                res = 1;
            } else if (!o1.isConditional() && o2.isConditional()) {
                res = -1;
            } else if (o1.isPrimitive() && !o2.isPrimitive()) {
                res = -1;
            } else if (!o1.isPrimitive() && o2.isPrimitive())
                res = 1;
//            if (res == 0) // 64 bit / 32 bit issues
//                res = (int) (o1.getMemOffset() - o2.getMemOffset());
            if (res == 0)
                res = o1.getType().getSimpleName().compareTo(o2.getType().getSimpleName());
            if (res == 0)
                res = o1.getName().compareTo(o2.getName());
            if (res == 0) {
                return o1.getField().getDeclaringClass().getName().compareTo(o2.getField().getDeclaringClass().getName());
            }
            return res;
        }
    };
    ...
}
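
To see what this ordering produces, here is a simplified, self-contained sketch of the same precedence. The Field class is a hypothetical stand-in for FSTFieldInfo that carries only a name, a type, and a version. The sketch leaves out the conditional flag and the declaring-class tie-breaker, and it assumes that "primitive" simply means Class#isPrimitive().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Simplified illustration of the field ordering; hypothetical, not part of FST. */
class FieldOrderSketch {

    /** Stand-in for FSTFieldInfo with only the attributes this sketch compares. */
    static final class Field {
        final String name;
        final Class<?> type;
        final int version;

        Field(String name, Class<?> type, int version) {
            this.name = name;
            this.type = type;
            this.version = version;
        }
    }

    // Precedence: version, then boolean, then other primitives,
    // then type simple name, then field name.
    static final Comparator<Field> ORDER = (a, b) -> {
        if (a.version != b.version) {
            return Integer.compare(a.version, b.version);
        }
        boolean aBool = a.type == boolean.class;
        boolean bBool = b.type == boolean.class;
        if (aBool != bBool) {
            return aBool ? -1 : 1;
        }
        if (a.type.isPrimitive() != b.type.isPrimitive()) {
            return a.type.isPrimitive() ? -1 : 1;
        }
        int res = a.type.getSimpleName().compareTo(b.type.getSimpleName());
        return res != 0 ? res : a.name.compareTo(b.name);
    };

    /** Sorts a sample bean's fields and returns their names in stream order. */
    static List<String> demo() {
        List<Field> fields = new ArrayList<>(Arrays.asList(
                new Field("added", String.class, 1),
                new Field("y", String.class, 0),
                new Field("x", int.class, 0),
                new Field("flag", boolean.class, 0),
                new Field("count", Integer.class, 0)));
        fields.sort(ORDER);

        List<String> names = new ArrayList<>();
        for (Field f : fields) {
            names.add(f.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [flag, x, count, y, added]
    }
}
```

The version 1 field sorts last regardless of its type or name, which is why a field added with a higher version always lands after every field that already existed.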

Now look at how deserialization instantiates an object and reads its fields, in FSTObjectInput#instantiateAndReadNoSer:

public class FSTObjectInput implements ObjectInput {
    protected Object instantiateAndReadNoSer(Class c, FSTClazzInfo clzSerInfo, FSTClazzInfo.FSTFieldInfo referencee, int readPos) throws Exception {
        Object newObj;
        newObj = clzSerInfo.newInstance(getCodec().isMapBased());
        ...
        } else {
            FSTClazzInfo.FSTFieldInfo[] fieldInfo = clzSerInfo.getFieldInfo();
            readObjectFields(referencee, clzSerInfo, fieldInfo, newObj, 0, 0);
        }
        return newObj;
    }

    protected void readObjectFields(FSTClazzInfo.FSTFieldInfo referencee, FSTClazzInfo serializationInfo, FSTClazzInfo.FSTFieldInfo[] fieldInfo, Object newObj, int startIndex, int version) throws Exception {
        
        if ( getCodec().isMapBased() ) {
            readFieldsMapBased(referencee, serializationInfo, newObj);
            if ( version >= 0 && newObj instanceof Unknown == false)
                getCodec().readObjectEnd();
            return;
        }
        if ( version < 0 )
            version = 0;
        int booleanMask = 0;
        int boolcount = 8;
        final int length = fieldInfo.length;
        int conditional = 0;
        for (int i = startIndex; i < length; i++) {	// Notice the loop here
            try {
                FSTClazzInfo.FSTFieldInfo subInfo = fieldInfo[i];
                if (subInfo.getVersion() > version ) {	 // Need to go to the next iteration
                    int nextVersion = getCodec().readVersionTag();	// The next version of the object stream
                    if ( nextVersion == 0 ) // old object read
                    {
                        oldVersionRead(newObj);
                        return;
                    }
                    if (nextVersion != subInfo.getVersion()) { // A field's version must not change; version changes must stay in sync with the stream
                        throw new RuntimeException("read version tag "+nextVersion+" fieldInfo has "+subInfo.getVersion());
                    }
                    readObjectFields(referencee,serializationInfo,fieldInfo,newObj,i,nextVersion);	// Recurse to read the fields of the next version
                    return;
                }
                if (subInfo.isPrimitive()) {
                	...
                } else {
                    if ( subInfo.isConditional() ) {
                    	...
                    }
                    // object: store the value just read into the field described by subInfo
                    Object subObject = readObjectWithHeader(subInfo);
                    subInfo.setObjectValue(newObj, subObject);
                }
                ...

The loop iterates over the fields in sorted order, and each FSTFieldInfo records the field's position, type, and other details in the object stream. In summary:

Serialization:

  • Sort all fields of the bean by version (excluding static and transient members). Fields without an @Version annotation default to version 0. Within the same version, the order is: boolean, primitives, conditionals, object references
  • Write the bean's fields to the output stream one by one in sorted order
  • The value of @Version may only increase, never decrease. If a newly added field has the same version as existing fields, the default sorting rules may make the order of fields in the stream differ from the order of FSTFieldInfo[] in memory

Deserialization:

  • Deserialization parses the object stream according to its format; the order in which fields were saved in the stream is kept consistent with the order of FSTFieldInfo in memory
  • A field of the same version is present in the object stream but missing from the in-memory bean: an exception may be thrown (a backward compatibility problem)
  • The object stream contains a higher-version field that the in-memory bean lacks: works normally (the old class can read the new stream)
  • A field of the same version is missing from the object stream but present in the in-memory bean: an exception is thrown
  • The same field has different versions in the object stream and in the in-memory bean: an exception is thrown
  • The in-memory bean adds a field whose version is not higher than the current maximum version: an exception is thrown

The rule for using @Version is that every new field must be annotated with @Version, with its value set to the current maximum version plus one. Deleting fields is not allowed.
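
The version-tag handling in readObjectFields can be modeled with a toy stream of integers. This is a sketch under stated assumptions, not FST's wire format: VersionTagSketch and its Field class are hypothetical, every field value is an int, and the writer is assumed to emit a tag whenever the version rises and a closing tag of 0 at the end. The reader mirrors the logic shown earlier: a tag of 0 means an old stream, and a mismatched tag is an error.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of @Version tags in a stream; hypothetical, not part of FST. */
class VersionTagSketch {

    /** A field name plus its @Version value (0 when unannotated). */
    static final class Field {
        final String name;
        final int version;

        Field(String name, int version) {
            this.name = name;
            this.version = version;
        }
    }

    // Field lists are given already sorted by version.
    static final Field[] OLD_CLASS = { new Field("x", 0), new Field("y", 0) };
    static final Field[] NEW_CLASS = { new Field("x", 0), new Field("y", 0), new Field("added", 1) };

    /** Writes values in field order, a tag when the version rises, and a final 0 tag. */
    static Deque<Integer> write(Field[] fields, Map<String, Integer> values) {
        Deque<Integer> stream = new ArrayDeque<>();
        int current = 0;
        for (Field f : fields) {
            if (f.version > current) {
                current = f.version;
                stream.add(current);          // version tag
            }
            stream.add(values.get(f.name));   // field value
        }
        stream.add(0);                        // assumed end marker
        return stream;
    }

    /** Reads with the reader's own field list; fields missing from the stream stay 0. */
    static Map<String, Integer> read(Field[] fields, Deque<Integer> stream) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Field f : fields) {
            result.put(f.name, 0);            // default value
        }
        int current = 0;
        for (Field f : fields) {
            if (f.version > current) {
                int tag = stream.remove();
                if (tag == 0) {
                    return result;            // old stream: newer fields keep defaults
                }
                if (tag != f.version) {
                    throw new IllegalStateException(
                            "read version tag " + tag + " field has " + f.version);
                }
                current = tag;
            }
            result.put(f.name, stream.remove());
        }
        return result;
    }

    static Map<String, Integer> sampleValues() {
        Map<String, Integer> values = new LinkedHashMap<>();
        values.put("x", 7);
        values.put("y", 9);
        values.put("added", 5);
        return values;
    }

    /** Stream written by the old class, read by the new class. */
    static Map<String, Integer> oldStreamNewClass() {
        return read(NEW_CLASS, write(OLD_CLASS, sampleValues()));
    }

    /** Stream written by the new class, read by the old class. */
    static Map<String, Integer> newStreamOldClass() {
        return read(OLD_CLASS, write(NEW_CLASS, sampleValues()));
    }

    public static void main(String[] args) {
        System.out.println(oldStreamNewClass()); // {x=7, y=9, added=0}
        System.out.println(newStreamOldClass()); // {x=7, y=9}
    }
}
```

Because higher-version fields always sort last, the old reader can stop after its own fields, and the new reader can stop as soon as it meets the 0 tag. Inserting a field in the middle, or deleting one, would shift every position after it, which is why neither is allowed.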

Also look at the Javadoc of the @Version annotation, which explicitly states that it exists for backward compatibility:

package org.nustaq.serialization.annotations;

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD})

/**
 * support for adding fields without breaking compatibility to old streams.
 * For each release of your app increment the version value. No Version annotation means version=0.
 * Note that each added field needs to be annotated.
 *
 * e.g.
 *
 * class MyClass implements Serializable {
 *
 *     // fields on initial release 1.0
 *     int x;
 *     String y;
 *
 *     // fields added with release 1.5
 *     @Version(1) String added;
 *     @Version(1) String alsoAdded;
 *
 *     // fields added with release 2.0
 *     @Version(2) String addedv2;
 *     @Version(2) String alsoAddedv2;
 *
 * }
 *
 * If an old class is read, new fields will be set to default values. You can register a VersionConflictListener
 * at FSTObjectInput in order to fill in defaults for new fields.
 *
 * Notes/Limits:
 * - Removing fields will break backward compatibility. You can only Add new fields.
 * - Can slow down serialization over time (if many versions)
 * - does not work for Externalizable or Classes which make use of JDK-special features such as readObject/writeObject
 *   (AKA does not work if fst has to fall back to 'compatible mode' for an object).
 * - in case you use custom serializers, your custom serializer has to handle versioning
 *
 */
public @interface Version {
    byte value();
}
}

Define the bean to be tested:

public class FSTBean implements Serializable {
    /** serialVersionUID */
    private static final long serialVersionUID = -2708653783151699375L;
    private Integer v0int;
    private String v0str;
    // getters and setters omitted
}

Prepare serialization and deserialization methods

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;

// FstSerializer is a wrapper class whose source is not shown in this article.
public class FSTSerial {

    private static void serialize(FstSerializer fst, String fileName) {
        try {
            FSTBean fstBean = new FSTBean();
            fstBean.setV0int(1);
            fstBean.setV0str("v0str");
            byte[] v1 = fst.serialize(fstBean);

            // Write the serialized bytes to the given file
            FileOutputStream fos = new FileOutputStream(new File(fileName));
            fos.write(v1, 0, v1.length);
            fos.close();

        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private static void deserialize(FstSerializer fst, String fileName) {
        try {
            // Read the whole file back into a byte array
            FileInputStream fis = new FileInputStream(new File(fileName));
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            byte[] buf = new byte[256];
            int length = 0;
            while ((length = fis.read(buf)) > 0) {
                baos.write(buf, 0, length);
            }
            fis.close();
            buf = baos.toByteArray();
            FSTBean deserial = fst.deserialize(buf, FSTBean.class);
            System.out.println(deserial);

        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        FstSerializer fst = new FstSerializer();
        serialize(fst, "byte.bin");
        deserialize(fst, "byte.bin");
    }
}

References

Github.com/RuedigerMoe…