Do you think you know Java programming? In fact, most programmers dabble with the Java platform, learning just enough to accomplish the task at hand. In this series, Ted Neward delves into the core features of the Java platform and reveals some little-known facts to help you solve your toughest programming challenges.

About this Series

About a year ago, a developer responsible for managing all of an application’s user settings decided to store them in a Hashtable, which was then serialized to disk for persistence. Whenever a user changed a setting, the Hashtable was written back to disk.

This was an elegant, open-ended settings system, but it broke down when the team decided to migrate from Hashtable to HashMap in the Java Collections library.

Hashtable and HashMap have different, incompatible serialized forms on disk. Short of running some kind of data conversion utility over every persisted set of user settings (an enormous task), it seemed that Hashtable would remain the application’s storage format forever.

The team felt stuck, but only because they didn’t know an important fact about Java serialization: Java serialization allows you to change types over time. When I showed them how to automate the serialization substitution, they made the transition to HashMap as planned.

This article is the first in a series devoted to revealing useful trivia about the Java platform: obscure details that are hard to discover but come in handy for solving Java programming challenges.

The Java Object Serialization API is a good place to start, as it has been part of the platform since JDK 1.1. This article introduces five things about serialization that should convince you to take a fresh look at those standard Java APIs.

Introduction to Java Serialization

Java object serialization, one of the pioneering features introduced in JDK 1.1, is a mechanism for converting the state of a graph of Java objects into a byte array for storage or transfer, from which the original objects can later be reconstructed.

In essence, the idea of serialization is to “freeze” object state, transfer it (write it to disk, send it over the network, and so on), and then “thaw” it to get usable Java objects back. All of this happens almost like magic, thanks to the ObjectInputStream/ObjectOutputStream classes, fully faithful metadata, and programmers’ willingness to “opt in” to the process by tagging their classes with the Serializable marker interface.

Listing 1 shows a Person class that implements Serializable.

Listing 1. Serializable Person

package com.tedneward;

public class Person
    implements java.io.Serializable
{
    public Person(String fn, String ln, int a)
    {
        this.firstName = fn; this.lastName = ln; this.age = a;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public int getAge() { return age; }
    public Person getSpouse() { return spouse; }

    public void setFirstName(String value) { firstName = value; }
    public void setLastName(String value) { lastName = value; }
    public void setAge(int value) { age = value; }
    public void setSpouse(Person value) { spouse = value; }

    public String toString()
    {
        return "[Person: firstName=" + firstName +
            " lastName=" + lastName +
            " age=" + age +
            " spouse=" + (spouse != null ? spouse.getFirstName() : "[null]") +
            "]";
    }    

    private String firstName;
    private String lastName;
    private int age;
    private Person spouse;

}

Once Person has been made Serializable, it is easy to write the object state to disk and read it back, as the JUnit 4 unit test below demonstrates.

Listing 2. Deserialize Person

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.fail;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

import org.junit.Test;

public class SerTest
{
    @Test public void serializeToDisk()
    {
        try
        {
            com.tedneward.Person ted = new com.tedneward.Person("Ted", "Neward", 39);
            com.tedneward.Person charl = new com.tedneward.Person("Charlotte", "Neward", 38);

            // Each person refers to the other, so the object graph is circular
            ted.setSpouse(charl); charl.setSpouse(ted);

            FileOutputStream fos = new FileOutputStream("tempdata.ser");
            ObjectOutputStream oos = new ObjectOutputStream(fos);
            oos.writeObject(ted);
            oos.close();
        }
        catch (Exception ex)
        {
            fail("Exception thrown during test: " + ex.toString());
        }

        try
        {
            FileInputStream fis = new FileInputStream("tempdata.ser");
            ObjectInputStream ois = new ObjectInputStream(fis);
            com.tedneward.Person ted = (com.tedneward.Person) ois.readObject();
            ois.close();

            assertEquals("Ted", ted.getFirstName());
            assertEquals("Charlotte", ted.getSpouse().getFirstName());

            // Clean up the file
            new File("tempdata.ser").delete();
        }
        catch (Exception ex)
        {
            fail("Exception thrown during test: " + ex.toString());
        }
    }
}

So far, we have seen nothing new or exciting, but it is a good place to start. We’ll use Person to discover five things you might not know about Java object serialization.

1. Serialization allows refactoring

Serialization tolerates a certain amount of class variation, so that even after refactoring, ObjectInputStream will still read older data correctly. The most important changes that the Java Object Serialization specification can manage automatically are:

  • Adding new fields to a class

  • Changing a field from static to non-static

  • Changing a field from transient to non-transient

Going the other way (from non-static to static or from non-transient to transient) or deleting a field requires additional massaging, depending on the degree of backward compatibility you need.

Refactoring serialized classes

Now that you know that serialization allows for refactoring, let’s take a look at what happens when a new field is added to the Person class.

As shown in Listing 3, PersonV2 adds a new gender field to the original Person class.

Listing 3. Adding a new field to the serialized Person

enum Gender
{
    MALE, FEMALE
}

public class Person
    implements java.io.Serializable
{
    public Person(String fn, String ln, int a, Gender g)
    {
        this.firstName = fn; this.lastName = ln; this.age = a; this.gender = g;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public Gender getGender() { return gender; }
    public int getAge() { return age; }
    public Person getSpouse() { return spouse; }

    public void setFirstName(String value) { firstName = value; }
    public void setLastName(String value) { lastName = value; }
    public void setGender(Gender value) { gender = value; }
    public void setAge(int value) { age = value; }
    public void setSpouse(Person value) { spouse = value; }

    public String toString()
    {
        return "[Person: firstName=" + firstName +
            " lastName=" + lastName +
            " gender=" + gender +
            " age=" + age +
            " spouse=" + (spouse != null ? spouse.getFirstName() : "[null]") +
            "]";
    }    

    private String firstName;
    private String lastName;
    private int age;
    private Person spouse;
    private Gender gender;
}

Serialization uses a hash that is calculated from almost everything in a given class (method names, field names, field types, access modifiers, and so on) and compares that hash value against the hash value stored in the serialized stream.

To convince the Java runtime that the two types are in fact the same, the second and all subsequent versions of Person must have the same serialization version hash (stored as the private static final serialVersionUID field) as the first version.

Therefore, we need the serialVersionUID field, which is calculated by running the JDK serialver command against the original (or V1) version of the Person class.
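As a rough sketch of what pinning the version looks like in code, the example below declares a serialVersionUID explicitly and reads it back through ObjectStreamClass.lookup(), which reports the value the runtime will use for a class. The class names and the value 1L are assumptions made for illustration; for a real class you would paste in the number that serialver reports for the original version.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerialVersionDemo {
    // Hypothetical class: the UID value is illustrative, not one produced by serialver.
    static class VersionedPerson implements Serializable {
        private static final long serialVersionUID = 1L;
        private String firstName;
        private String lastName;
    }

    // No declared serialVersionUID: the runtime computes a hash from the class shape.
    static class UnversionedPerson implements Serializable {
        private String firstName;
    }

    // Returns the serialVersionUID the serialization runtime associates with a class.
    static long uidOf(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) {
            throw new IllegalArgumentException(type + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("declared: " + uidOf(VersionedPerson.class));
        System.out.println("computed: " + uidOf(UnversionedPerson.class));
    }
}
```

The computed value changes whenever the class shape changes, which is why later versions need the declared field.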

Once you have Person’s serialVersionUID, not only can you create PersonV2 objects from the serialized data of the original Person (any field that is new in PersonV2 is set to its default value, most commonly null), you can also do the reverse: deserialize PersonV2 data into an original Person, with no surprises.

2. Serialization is not secure

To the surprise and chagrin of many Java developers, the serialization binary format is fully documented and entirely reversible. In fact, simply dumping the contents of a binary serialized stream to the console is enough to work out what the class looks like and what it contains.

This has some disturbing security implications. For example, when a remote method call is made over RMI, any private field in an object sent across the wire appears in the socket stream as almost plain text, which clearly violates even the simplest security requirements.
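To make the point concrete, here is a minimal sketch that serializes an object to memory and searches the raw bytes for readable text. The Account class and its password field are hypothetical, invented only for this demonstration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

class StreamPeekDemo {
    // Hypothetical class with a private field that looks "hidden" in source code.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String password;

        Account(String password) { this.password = password; }
    }

    // Serializes the object and returns the raw stream as text.
    static String serializeToText(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        // ISO-8859-1 maps every byte to exactly one character, so nothing is lost.
        return new String(bytes.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        String raw = serializeToText(new Account("hunter2"));
        System.out.println("class name visible: " + raw.contains("Account"));
        System.out.println("field name visible: " + raw.contains("password"));
        System.out.println("field value visible: " + raw.contains("hunter2"));
    }
}
```

All three checks print true: the class name, the private field’s name, and its value sit in the stream as readable text.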

Fortunately, serialization allows you to “hook” the serialization process and protect (or obfuscate) field data before and after serialization. You can do this by providing a writeObject method on the Serializable object.

Obfuscating serialized data

Suppose that the sensitive data in the Person class is the age field. After all, not everyone likes to reveal their age. We can obfuscate this data by rotating its bits one place to the left before serialization, and then rotating them back after deserialization. (You can develop a more secure algorithm; this one is just an example.)

To “hook” the serialization process, we’ll implement a writeObject method on Person; to “hook” the deserialization process, we’ll implement a readObject method on the same class. It’s important to get the details of both methods right: if the access modifier, parameters, or name differ from what’s in Listing 4, the code will silently fail and the age of Person will be exposed.

Listing 4. Obfuscating serialized data

public class Person
    implements java.io.Serializable
{
    public Person(String fn, String ln, int a)
    {
        this.firstName = fn; this.lastName = ln; this.age = a;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public int getAge() { return age; }
    public Person getSpouse() { return spouse; }

    public void setFirstName(String value) { firstName = value; }
    public void setLastName(String value) { lastName = value; }
    public void setAge(int value) { age = value; }
    public void setSpouse(Person value) { spouse = value; }

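    private void writeObject(java.io.ObjectOutputStream stream)
        throws java.io.IOException
    {
        // "Encrypt"/obscure the sensitive data: rotate the bits one place left
        int original = age;
        age = Integer.rotateLeft(age, 1);
        try
        {
            stream.defaultWriteObject();
        }
        finally
        {
            // Restore the in-memory value so the live object is left unchanged
            age = original;
        }
    }

    private void readObject(java.io.ObjectInputStream stream)
        throws java.io.IOException, ClassNotFoundException
    {
        stream.defaultReadObject();

        // "Decrypt"/de-obscure the sensitive data: rotate the bits back
        age = Integer.rotateRight(age, 1);
    }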

    public String toString()
    {
        return "[Person: firstName=" + firstName +
            " lastName=" + lastName +
            " age=" + age +
            " spouse=" + (spouse != null ? spouse.getFirstName() : "[null]") +
            "]";
    }      

    private String firstName;
    private String lastName;
    private int age;
    private Person spouse;
}

If you want to see the obfuscated data, you can always examine the serialized stream or file. And because the format is fully documented, the contents of a serialized stream can be read even when the class itself is not available.
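The earlier warning about getting the method signatures exactly right is easy to check for yourself. In the sketch below, two hypothetical classes (Hooked and NotHooked, names invented for this example) differ only in the access modifier of writeObject, and a static counter records whether serialization actually called the method.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class HookSignatureDemo {
    // Correct signature: private, void, takes an ObjectOutputStream.
    static class Hooked implements Serializable {
        private static final long serialVersionUID = 1L;
        static int calls = 0; // static, so it is not part of the serialized state

        private void writeObject(ObjectOutputStream stream) throws IOException {
            calls++;
            stream.defaultWriteObject();
        }
    }

    // Same method, but public: serialization does not treat it as a hook.
    static class NotHooked implements Serializable {
        private static final long serialVersionUID = 1L;
        static int calls = 0;

        public void writeObject(ObjectOutputStream stream) throws IOException {
            calls++;
            stream.defaultWriteObject();
        }
    }

    // Serializes to a throwaway in-memory buffer.
    static void serialize(Serializable obj) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
        }
    }

    public static void main(String[] args) throws IOException {
        serialize(new Hooked());
        serialize(new NotHooked());
        System.out.println("private hook calls: " + Hooked.calls);
        System.out.println("public method calls: " + NotHooked.calls);
    }
}
```

No exception or warning is raised for the public variant; the object is simply written with default serialization, which is exactly the silent failure described above.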

3. Serialized data can be signed and sealed

The previous tip assumes that you only want to obfuscate the serialized data, not encrypt it or ensure that it has not been modified. Cryptographic encryption and signature management could certainly be implemented with writeObject and readObject, but there is a better way.

If you need to encrypt and sign an entire object, the simplest approach is to wrap it in a javax.crypto.SealedObject and/or a java.security.SignedObject. Both are serializable, so wrapping an object in a SealedObject creates a kind of “box” around the original object. Decryption requires the key that the Cipher was initialized with (typically a symmetric key), and that key must be managed separately. Similarly, SignedObject can be used to verify the data: it is signed with a private key and verified with the matching public key, and those keys must be managed separately as well.

Together, these two objects make it easy to seal and sign serialized data without stressing over the details of digital signature verification or encryption. Pretty neat, right?
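A minimal sketch of both wrappers follows. The algorithm choices (AES in CBC mode, SHA256withRSA) and the in-memory generated keys are assumptions made to keep the example self-contained; a real application would load its keys from a key store and choose algorithms to suit its own requirements.

```java
import java.io.Serializable;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.Signature;
import java.security.SignedObject;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SealedObject;
import javax.crypto.SecretKey;

class SealAndSignDemo {
    // Encrypts the serialized form of obj; the cipher parameters travel with the SealedObject.
    static SealedObject seal(Serializable obj, SecretKey key) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        return new SealedObject(obj, cipher);
    }

    // Signs the serialized form of obj with the private key.
    static SignedObject sign(Serializable obj, PrivateKey key) throws Exception {
        return new SignedObject(obj, key, Signature.getInstance("SHA256withRSA"));
    }

    public static void main(String[] args) throws Exception {
        String payload = "Ted Neward, 39"; // any Serializable object would do

        // Keys are generated on the fly here purely for demonstration.
        SecretKey aesKey = KeyGenerator.getInstance("AES").generateKey();
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair rsaKeys = generator.generateKeyPair();

        SealedObject sealed = seal(payload, aesKey);
        System.out.println("unsealed: " + sealed.getObject(aesKey));

        SignedObject signed = sign(payload, rsaKeys.getPrivate());
        boolean valid = signed.verify(rsaKeys.getPublic(), Signature.getInstance("SHA256withRSA"));
        System.out.println("signature valid: " + valid + ", content: " + signed.getObject());
    }
}
```

Because both wrappers are themselves Serializable, either one can be handed to an ObjectOutputStream in place of the original object, and the two can be nested if you need both sealing and signing.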

4. Serialization can put a proxy in the stream

In many cases, a class contains a core data element from which the other fields in the class can be derived or retrieved. In such cases, there is no need to serialize the entire object. You could mark the derived fields transient, but then every method that accesses one of them must still explicitly include code to check whether it has been initialized.
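The cost of the transient approach is easier to see in code. In this sketch, the hypothetical Name class (invented for illustration) caches a derived full field; because the field is transient it comes back as null after deserialization, so the accessor has to check and rebuild it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    static class Name implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String first;
        private final String last;
        private transient String full; // derived from first and last, never written to the stream

        Name(String first, String last) {
            this.first = first;
            this.last = last;
            this.full = first + " " + last;
        }

        boolean isFullCached() { return full != null; }

        String getFull() {
            // Every accessor of a transient field needs a guard like this one.
            if (full == null) {
                full = first + " " + last;
            }
            return full;
        }
    }

    // Serializes obj to memory and reads it straight back.
    static <T extends Serializable> T roundTrip(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        Name copy = roundTrip(new Name("Ted", "Neward"));
        System.out.println("cached after read: " + copy.isFullCached());
        System.out.println("rebuilt on demand: " + copy.getFull());
    }
}
```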

If serialization is the primary concern, it is better to nominate a flyweight or proxy to put in the stream instead. Give the original Person a writeReplace method, and a different kind of object will be serialized in its place. Similarly, if a readResolve method is found during deserialization, it is called to supply the replacement object back to the caller.

Packing and unpacking the proxy

The writeReplace and readResolve methods enable the Person class to package all of its data (or the core data in it) into a PersonProxy, place it in a stream, and unpack it at deserialization time.

Listing 5. You complete me and I replace you

class PersonProxy
    implements java.io.Serializable
{
    public PersonProxy(Person orig)
    {
        data = orig.getFirstName() + "," + orig.getLastName() + "," + orig.getAge();
        if (orig.getSpouse() != null)
        {
            Person spouse = orig.getSpouse();
            data = data + "," + spouse.getFirstName() + "," + spouse.getLastName() + ","
              + spouse.getAge();
        }
    }

    public String data;
    private Object readResolve()
        throws java.io.ObjectStreamException
    {
        String[] pieces = data.split(",");
        Person result = new Person(pieces[0], pieces[1], Integer.parseInt(pieces[2]));
        if (pieces.length > 3)
        {
            result.setSpouse(new Person(pieces[3], pieces[4], Integer.parseInt
              (pieces[5])));
            result.getSpouse().setSpouse(result);
        }
        return result;
    }
}

public class Person
    implements java.io.Serializable
{
    public Person(String fn, String ln, int a)
    {
        this.firstName = fn; this.lastName = ln; this.age = a;
    }

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public int getAge() { return age; }
    public Person getSpouse() { return spouse; }

    private Object writeReplace()
        throws java.io.ObjectStreamException
    {
        return new PersonProxy(this);
    }

    public void setFirstName(String value) { firstName = value; }
    public void setLastName(String value) { lastName = value; }
    public void setAge(int value) { age = value; }
    public void setSpouse(Person value) { spouse = value; }   

    public String toString()
    {
        return "[Person: firstName=" + firstName +
            " lastName=" + lastName +
            " age=" + age +
            " spouse=" + (spouse != null ? spouse.getFirstName() : "[null]") +
            "]";
    }    

    private String firstName;
    private String lastName;
    private int age;
    private Person spouse;
}

Note that PersonProxy must keep track of all of Person’s data. Often this means that the proxy needs to be an inner class of Person so that it can access the private fields. Sometimes the proxy also needs to track down other object references and serialize them manually, as it does with Person’s spouse.

This technique is one of the few that does not need to be balanced between reading and writing. For example, a version of a class that has been refactored into a different type can provide a readResolve method that silently converts serialized objects to the new type. Similarly, a class can use writeReplace to take old instances and write them out as the new version.
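As a sketch of the read-only half of that idea, the example below keeps an old class around solely so that its readResolve method can hand back an instance of the new type. The LegacySettings and Settings classes are hypothetical, loosely modeled on the Hashtable-to-HashMap story from the introduction; this is one possible shape for such a migration, not necessarily the one that team used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class MigrationDemo {
    // New type that the application wants to work with from now on.
    static class Settings implements Serializable {
        private static final long serialVersionUID = 1L;
        final HashMap<String, String> values;

        Settings(Map<String, String> source) { this.values = new HashMap<>(source); }
    }

    // Old type, kept only so that existing streams can still be read.
    static class LegacySettings implements Serializable {
        private static final long serialVersionUID = 1L;
        final Hashtable<String, String> values = new Hashtable<>();

        // Called after the old object is rebuilt; the caller receives the new type instead.
        private Object readResolve() throws ObjectStreamException {
            return new Settings(values);
        }
    }

    // Serializes obj to memory and reads it straight back.
    static Object roundTrip(Serializable obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        LegacySettings old = new LegacySettings();
        old.values.put("theme", "dark");

        Object restored = roundTrip(old);
        System.out.println("restored type: " + restored.getClass().getSimpleName());
        System.out.println("theme: " + ((Settings) restored).values.get("theme"));
    }
}
```

There is no matching writeReplace here, which is the point: only the read side needs to know about the old format.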

5. Trust, but verify

It would be nice to assume that the data in the serialized stream is always the same as the data that was originally written to it. But, as a former US President put it, “trust, but verify.” For serialized objects, this means validating the fields to ensure that they still hold legitimate values after deserialization, “just in case.”

To do this, implement the ObjectInputValidation interface and override the validateObject() method, and register the object for the callback by calling ObjectInputStream.registerValidation() from within readObject. If validateObject() finds something amiss when it is called, it throws an InvalidObjectException.
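Here is a minimal sketch of that pattern. The CheckedPerson class and its rule (a name must be present and age must not be negative) are assumptions chosen for illustration; the constructor deliberately performs no checks so that an invalid instance can be written and then rejected on the way back in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectInputValidation;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ValidationDemo {
    static class CheckedPerson implements Serializable, ObjectInputValidation {
        private static final long serialVersionUID = 1L;
        private final String name;
        private final int age;

        // No checks here on purpose, so the demo can serialize a bad instance.
        CheckedPerson(String name, int age) { this.name = name; this.age = age; }

        private void readObject(ObjectInputStream stream) throws IOException, ClassNotFoundException {
            // Ask the stream to call validateObject() once the whole graph has been read.
            stream.registerValidation(this, 0);
            stream.defaultReadObject();
        }

        @Override
        public void validateObject() throws InvalidObjectException {
            if (name == null || age < 0) {
                throw new InvalidObjectException("Invalid person state: name=" + name + ", age=" + age);
            }
        }
    }

    // Serializes obj to memory and reads it straight back.
    static Object roundTrip(Serializable obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        roundTrip(new CheckedPerson("Ted", 39));
        System.out.println("valid person accepted");
        try {
            roundTrip(new CheckedPerson("Ted", -1));
        } catch (InvalidObjectException ex) {
            System.out.println("invalid person rejected: " + ex.getMessage());
        }
    }
}
```

Without the registerValidation() call, validateObject() is never invoked, so implementing the interface alone is not enough.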

Conclusion

Java object serialization is more flexible than most Java developers realize, which gives us more opportunities to solve tricky situations.

Fortunately, programming tricks like these are ubiquitous in the JVM. The key is to know them and use them when you get stuck.

Source: www.topthink.com/topic/11361.html
