
Java object serialization is one of the pioneering features introduced in JDK 1.1. It is a mechanism for converting the state of Java objects into byte arrays for storage or transfer, which can later be converted back into the original Java objects. The mechanics and principles of Java object serialization were previously covered in 51CTO. Here we’ll use a Person class to discover five things you might not know about Java object serialization.

In fact, the idea of serialization is to “freeze” object state, transfer that state (write it to disk, send it over the network, and so on), and then “unfreeze” it to get usable Java objects back. All of this happens almost like magic, thanks to the ObjectInputStream/ObjectOutputStream classes, the complete fidelity of the metadata, and the programmer’s willingness to “opt in” to the process by tagging classes with the Serializable marker interface. Listing 1 shows a Person class that implements Serializable.

Listing 1. Serializable Person

package com.tedneward;

public class Person implements java.io.Serializable {
    public Person(String fn, String ln, int a) {
        this.firstName = fn;
        this.lastName = ln;
        this.age = a;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public int getAge() { return age; }
    public Person getSpouse() { return spouse; }

    public void setFirstName(String value) { firstName = value; }
    public void setLastName(String value) { lastName = value; }
    public void setAge(int value) { age = value; }
    public void setSpouse(Person value) { spouse = value; }

    public String toString() {
        // Guard against a null spouse so toString() is safe for every Person
        return "[Person: firstName=" + firstName +
            " lastName=" + lastName +
            " age=" + age +
            " spouse=" + (spouse != null ? spouse.getFirstName() : "[null]") +
            "]";
    }

    private String firstName;
    private String lastName;
    private int age;
    private Person spouse;
}

Once Person is serializable, it’s easy to write the object state to disk and then read it back, as demonstrated in the JUnit 4 unit test below.

Listing 2. Deserializing Person

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.fail;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

import org.junit.Test;

public class SerTest {
    @Test
    public void serializeToDisk() {
        // Write ted (and, through the spouse reference, charl) to disk
        try {
            com.tedneward.Person ted = new com.tedneward.Person("Ted", "Neward", 39);
            com.tedneward.Person charl = new com.tedneward.Person("Charlotte", "Neward", 38);
            ted.setSpouse(charl);
            charl.setSpouse(ted);

            FileOutputStream fos = new FileOutputStream("tempdata.ser");
            ObjectOutputStream oos = new ObjectOutputStream(fos);
            oos.writeObject(ted);
            oos.close();
        } catch (Exception ex) {
            fail("Exception thrown during test: " + ex.toString());
        }

        // Read the object graph back and check that it survived the trip
        try {
            FileInputStream fis = new FileInputStream("tempdata.ser");
            ObjectInputStream ois = new ObjectInputStream(fis);
            com.tedneward.Person ted = (com.tedneward.Person) ois.readObject();
            ois.close();

            assertEquals("Ted", ted.getFirstName());
            assertEquals("Charlotte", ted.getSpouse().getFirstName());

            // Clean up the file
            new File("tempdata.ser").delete();
        } catch (Exception ex) {
            fail("Exception thrown during test: " + ex.toString());
        }
    }
}

So far there is nothing new or exciting here, but it’s a good place to start.

1. Serialization allows refactoring

Serialization tolerates a certain amount of class variation, so that even after refactoring, ObjectInputStream will still read the objects back correctly. The key changes that the Java Object Serialization specification can handle automatically are:

◆ Adding a new field to the class.

◆ Changing a field from static to non-static.

◆ Changing a field from transient to non-transient.

Depending on the degree of backward compatibility required, changing a field in the opposite direction (from non-static to static, or from non-transient to transient) or deleting a field requires additional massaging.

Refactoring serialized classes

Now that you know that serialization allows for refactoring, let’s take a look at what happens when a new field is added to the Person class. As shown in Listing 3, PersonV2 adds a gender field to the original Person class.

Listing 3. Adding a new field to the serialized Person

enum Gender { MALE, FEMALE }

public class Person implements java.io.Serializable {
    public Person(String fn, String ln, int a, Gender g) {
        this.firstName = fn;
        this.lastName = ln;
        this.age = a;
        this.gender = g;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public Gender getGender() { return gender; }
    public int getAge() { return age; }
    public Person getSpouse() { return spouse; }

    public void setFirstName(String value) { firstName = value; }
    public void setLastName(String value) { lastName = value; }
    public void setGender(Gender value) { gender = value; }
    public void setAge(int value) { age = value; }
    public void setSpouse(Person value) { spouse = value; }

    public String toString() {
        // Guard against a null spouse so toString() is safe for every Person
        return "[Person: firstName=" + firstName +
            " lastName=" + lastName +
            " gender=" + gender +
            " age=" + age +
            " spouse=" + (spouse != null ? spouse.getFirstName() : "[null]") +
            "]";
    }

    private String firstName;
    private String lastName;
    private int age;
    private Person spouse;
    private Gender gender; // new in V2
}

Serialization uses a hash that is calculated from almost everything in a given class (method names, field names, field types, access modifiers, and so on), and it compares that hash value to the hash value stored in the serialized stream.

To convince the Java runtime that the two types are actually the same, the second and all subsequent versions of Person must have the same serialization version hash as the first version (stored as the private static final serialVersionUID field). What we need, therefore, is a serialVersionUID field whose value is calculated by running the JDK serialver command against the original (V1) version of the Person class.
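
As a minimal sketch of what pinning the version looks like, the class below declares the field explicitly and then asks the runtime which value it will use. VersionedPerson and SerialVersionDemo are hypothetical names for this illustration, and the value 1L is an arbitrary placeholder rather than the output of serialver for the article's Person class.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical stand-in for Person, used only in this illustration.
class VersionedPerson implements Serializable {
    // Pinned explicitly so that later versions of the class remain stream-compatible.
    // 1L is a placeholder; in practice you would paste the value reported by
    // serialver for the first version of the class.
    private static final long serialVersionUID = 1L;

    private String firstName;
    private String lastName;
    private int age;

    VersionedPerson(String firstName, String lastName, int age) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }
}

class SerialVersionDemo {
    // Returns the serialVersionUID the serialization runtime uses for a class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(VersionedPerson.class)); // prints 1
    }
}
```

If the field were left out, the same lookup would return the computed hash, which changes as soon as fields or methods are added.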

Once you have Person’s serialVersionUID, not only can you create PersonV2 objects from the serialized data of the original Person (where a new field appears, it is set to the default value for its type, most commonly null), you can also do the reverse: deserialize an original Person from PersonV2 data, with no additional surprises.

2. Serialization is not secure

It often comes as an unpleasant surprise to Java developers that the serialization binary format is fully documented and entirely reversible. In fact, simply dumping the contents of the binary serialized stream to the console is enough to see what the class looks like and what it contains.
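
A small sketch makes this concrete. It serializes an object to memory and prints the stream with non-printable bytes replaced by dots. Secret and StreamDump are hypothetical names invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical class with a private field we might assume is hidden.
class Secret implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String password;

    Secret(String password) {
        this.password = password;
    }
}

class StreamDump {
    // Serializes the object into an in-memory byte array.
    static byte[] serialize(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Keeps printable ASCII and replaces every other byte with '.',
    // like the text column of a hex dump.
    static String printable(byte[] raw) {
        StringBuilder sb = new StringBuilder();
        for (byte b : raw) {
            sb.append(b >= 32 && b < 127 ? (char) b : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The class name, the field name, and the field value all show up in the dump.
        System.out.println(printable(serialize(new Secret("hunter2"))));
    }
}
```

The private modifier offers no protection here: the field name and its value are both readable in the raw stream.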

This has some disturbing implications for security. For example, when a remote method call is made over RMI, any private field in an object sent across the connection appears almost as plain text in the socket stream, which clearly violates even the simplest security requirements.

Fortunately, serialization allows you to “hook” the serialization process and protect (or obfuscate) field data before and after serialization. You can do this by providing a writeObject method on the Serializable object.

Obfuscating serialized data

Assume that the sensitive data in the Person class is the age field. After all, a lady never tells her age. We can obfuscate the data before serialization by rotating its bits one position to the left, and then rotate them back after deserialization. (You can develop a more secure algorithm; this one is just an example.)

To “hook” the serialization process, we’ll implement a writeObject method on Person; to “hook” the deserialization process, we’ll implement a readObject method on the same class. It’s important to get the details of both methods right: if the access modifier, the parameters, or the name differ from what is shown in Listing 4, the code will fail silently and the Person’s age will be exposed.

Listing 4. Obfuscating the serialized data

public class Person implements java.io.Serializable {
    public Person(String fn, String ln, int a) {
        this.firstName = fn;
        this.lastName = ln;
        this.age = a;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public int getAge() { return age; }
    public Person getSpouse() { return spouse; }

    public void setFirstName(String value) { firstName = value; }
    public void setLastName(String value) { lastName = value; }
    public void setAge(int value) { age = value; }
    public void setSpouse(Person value) { spouse = value; }

    private void writeObject(java.io.ObjectOutputStream stream)
            throws java.io.IOException {
        // "Encrypt"/obscure the sensitive data: rotate the bits one position left
        int original = age;
        age = Integer.rotateLeft(age, 1);
        try {
            stream.defaultWriteObject();
        } finally {
            // Restore the real value so the in-memory object is unchanged
            age = original;
        }
    }

    private void readObject(java.io.ObjectInputStream stream)
            throws java.io.IOException, ClassNotFoundException {
        stream.defaultReadObject();

        // "Decrypt"/de-obscure the sensitive data: rotate the bits back
        age = Integer.rotateRight(age, 1);
    }

    public String toString() {
        return "[Person: firstName=" + firstName +
            " lastName=" + lastName +
            " age=" + age +
            " spouse=" + (spouse != null ? spouse.getFirstName() : "[null]") +
            "]";
    }

    private String firstName;
    private String lastName;
    private int age;
    private Person spouse;
}

If you need to see the obfuscated data, you can always look at the serialized data stream or file. And because the format is fully documented, the contents of a serialized stream can be read even when the class itself is not available.

3. Serialized data can be signed and sealed

The previous tip assumes that you want to obfuscate the serialized data, rather than encrypt it or ensure that it has not been modified. Cryptographic encryption and signature management can certainly be implemented with writeObject and readObject, but there is a better way.

If you need to encrypt and sign an entire object, the simplest approach is to put it in a javax.crypto.SealedObject and/or java.security.SignedObject wrapper. Both are serializable, so wrapping an object in SealedObject creates a kind of “box” around the original object. A symmetric key is required for decryption, and the key must be managed separately. Similarly, SignedObject can be used for data verification: it is signed with a private key and verified with the matching public key, and these keys must also be managed separately. Together, these two classes make it easy to seal and sign serialized data without worrying about the details of digital signature verification or encryption. Pretty neat, right?
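
The following sketch shows both wrappers in use, under a few assumptions that are not from the article: the class name SealAndSignDemo, the choice of AES with CBC for sealing, and RSA with SHA-256 for signing are picked purely for illustration, and the keys are generated on the spot instead of being loaded from a managed key store.

```java
import java.io.Serializable;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;
import java.security.SignedObject;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SealedObject;
import javax.crypto.SecretKey;

class SealAndSignDemo {
    // Seals the payload with a freshly generated AES key, then unseals it again.
    static Object sealRoundTrip(Serializable payload) {
        try {
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(128);
            SecretKey key = keyGen.generateKey();

            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key);

            // The SealedObject is itself Serializable and could be written to a stream here.
            SealedObject sealed = new SealedObject(payload, cipher);
            return sealed.getObject(key);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Signs the payload with a private key and verifies it with the public key.
    static boolean signAndVerify(Serializable payload) {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            KeyPair pair = gen.generateKeyPair();

            SignedObject signed = new SignedObject(
                payload, pair.getPrivate(), Signature.getInstance("SHA256withRSA"));
            return signed.verify(pair.getPublic(), Signature.getInstance("SHA256withRSA"));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sealRoundTrip("Ted Neward, 39"));
        System.out.println(signAndVerify("Ted Neward, 39"));
    }
}
```

To get both confidentiality and integrity, one wrapper can be nested inside the other, since each accepts any Serializable.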

4. Serialization can put a proxy in the stream

In many cases, a class contains a core data element from which the other fields in the class can be derived or looked up. In that case, there is no need to serialize the entire object. The derived fields can be marked transient, but the class must then contain explicit code, in every method that accesses such a field, to check whether it has been initialized.
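
Here is a minimal sketch of that transient approach and the initialization check it forces on the class. FullName and TransientDemo are hypothetical names for this illustration and are not part of the Person listings.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class: first and last are the core data, display is derived.
class FullName implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String first;
    private final String last;
    // Derived from first and last, so it is skipped during serialization.
    private transient String display;

    FullName(String first, String last) {
        this.first = first;
        this.last = last;
    }

    String display() {
        // After deserialization the transient field is null, so rebuild it on demand.
        if (display == null) {
            display = first + " " + last;
        }
        return display;
    }
}

class TransientDemo {
    // Serializes the object to memory and reads it straight back.
    static FullName roundTrip(FullName original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (FullName) in.readObject();
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FullName copy = roundTrip(new FullName("Ted", "Neward"));
        System.out.println(copy.display()); // prints "Ted Neward"
    }
}
```

The null check works for a single derived field, but it has to be repeated for every transient field and every accessor.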

If serialization is the primary concern, it is better to nominate a flyweight or proxy to put in the stream instead. Give the original Person a writeReplace method, and an object of a different type is serialized in its place. Similarly, if a readResolve method is found during deserialization, it is called to supply the replacement object that is handed back to the caller.

Packing and unpacking the proxy

The writeReplace and readResolve methods enable the Person class to package all of its data (or the core data in it) into a PersonProxy, place it in a stream, and unpack it at deserialization time.

Listing 5. You complete me and I replace you

class PersonProxy implements java.io.Serializable {
    public PersonProxy(Person orig) {
        // Flatten the Person (and the spouse, if any) into one comma-separated string
        data = orig.getFirstName() + "," + orig.getLastName() + "," + orig.getAge();
        if (orig.getSpouse() != null) {
            Person spouse = orig.getSpouse();
            data = data + "," + spouse.getFirstName() + "," + spouse.getLastName() + "," + spouse.getAge();
        }
    }

    public String data;

    private Object readResolve() throws java.io.ObjectStreamException {
        // Rebuild the Person (and the spouse) from the flattened string
        String[] pieces = data.split(",");
        Person result = new Person(pieces[0], pieces[1], Integer.parseInt(pieces[2]));
        if (pieces.length > 3) {
            result.setSpouse(new Person(pieces[3], pieces[4], Integer.parseInt(pieces[5])));
            result.getSpouse().setSpouse(result);
        }
        return result;
    }
}

public class Person implements java.io.Serializable {
    public Person(String fn, String ln, int a) {
        this.firstName = fn;
        this.lastName = ln;
        this.age = a;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public int getAge() { return age; }
    public Person getSpouse() { return spouse; }

    // Called by the serialization runtime: the proxy goes into the stream instead of this object
    private Object writeReplace() throws java.io.ObjectStreamException {
        return new PersonProxy(this);
    }

    public void setFirstName(String value) { firstName = value; }
    public void setLastName(String value) { lastName = value; }
    public void setAge(int value) { age = value; }
    public void setSpouse(Person value) { spouse = value; }

    public String toString() {
        return "[Person: firstName=" + firstName +
            " lastName=" + lastName +
            " age=" + age +
            " spouse=" + (spouse != null ? spouse.getFirstName() : "[null]") +
            "]";
    }

    private String firstName;
    private String lastName;
    private int age;
    private Person spouse;
}

Note that PersonProxy must keep track of all of Person’s data. This usually means that the proxy needs to be an inner class of Person so that it can access the private fields. Sometimes the proxy also needs to track down other object references and serialize them by hand, such as Person’s spouse.

This technique is one of the few that does not require reads and writes to be balanced. For example, a version of a class that has been refactored into a different type can provide a readResolve method to silently convert serialized objects to the new type. Similarly, it can use the writeReplace method to take old classes and serialize them as the new version.

5. Trust, but verify

It would be nice to assume that the data in the serialized stream is always the same as the data originally written to it. But, as a former US President put it, “trust, but verify”.

For serialized objects, this means validating the fields to ensure that they still hold legitimate values after deserialization, “just in case.” To do this, you implement the ObjectInputValidation interface and its validateObject() method, and register the object from within readObject by calling ObjectInputStream.registerValidation. If something looks wrong when the method is called, it throws an InvalidObjectException.
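
A minimal sketch of such a validator follows. CheckedPerson and ValidationDemo are hypothetical names, the rule that age must not be negative is an assumption made for the example, and the constructor deliberately skips validation so that the demo can produce a bad stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectInputValidation;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class for illustration.
class CheckedPerson implements Serializable, ObjectInputValidation {
    private static final long serialVersionUID = 1L;

    private final String name;
    private final int age;

    // Deliberately unvalidated so the demo can serialize an invalid instance.
    CheckedPerson(String name, int age) {
        this.name = name;
        this.age = age;
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        // Ask the stream to call validateObject() once the whole graph is restored.
        in.registerValidation(this, 0);
        in.defaultReadObject();
    }

    @Override
    public void validateObject() throws InvalidObjectException {
        if (name == null || age < 0) {
            throw new InvalidObjectException("Invalid person state");
        }
    }
}

class ValidationDemo {
    // Returns true if the object can be serialized and read back without a validation error.
    static boolean survivesRoundTrip(CheckedPerson person) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(person);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
                return true;
            }
        } catch (InvalidObjectException e) {
            return false; // validateObject() rejected the data
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(survivesRoundTrip(new CheckedPerson("Ted", 39))); // prints true
        System.out.println(survivesRoundTrip(new CheckedPerson("Ted", -1))); // prints false
    }
}
```

Because the callback runs only after the entire object graph has been read, the check can also inspect relationships between objects rather than single fields.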

Conclusion

Java object serialization is more flexible than most Java developers realize, which gives us more options for solving tricky situations. Fortunately, programming tricks like these are everywhere in the JVM. The key is to know about them and to reach for them when you get stuck.