Object deserialization handling

2/7/2009

In the last month I rewrote the IDR prototype from scratch and translated the IDR specification document in English. During this process I made a few enhancements in the IDR encoding. I removed an ambiguity with exceptions decoding in some very unlikely situations. The other change was to integrate the update of IEEE 754 specification in 2008 that now defines four types of floating point values, 2 Bytes, 4 Bytes, 8 Bytes and 16 Bytes. It may take some time until these types reach your desk, but IDR should better stick to the standards. So these will be the floating point encodings supported by IDR.

Beside these, there was a much bigger problem left in the API of object deserialization. The problem is to determine what to do when the decoder doesn't recognize the class type of a serialized object. The solution I came up is very satisfying since it matches all the requirements I had. It remains to check its usage convenience with real examples.

The problem

Object deserialization is a process in which the decoder reconstruct the serialized object aggregate. To do so it has to reconstruct each object of the aggregate and restore their pointers to each other. Objects are reconstructed by using object factories, a classic in design pattern. An object factory is an object that "knows" how to reconstruct some types of objects.

The decoder has thus a collection of factories to which it delegates the reconstruction of the different types of objects found in the serialized aggregate. But what happens if the decoder can't find an appropriate factory for some type of serialized object ? In some use case this should be considered as an error, but in others it might be an acceptable and even desirable situation.

Consider for instance exceptions. In IDR an exception is an object and handled as such. There is no point, and even impossible, for a decoder to have a factory for all possible exceptions in the world. It is enough for the decoder to have a factory for the common exception base classes and, of course, the one it has to deal with. It should then be enough to reconstruct the object as an instance of the parent class it has a factory for, a process called object slicing.

The worst case is when the decoder may not even slice the object because none of the parent class is "known" by the decoder. In this case the best the decoder can do is to ignore the object and set all references to it to NULL. We'll call this process object pruning. As for slicing, it may be considered as an acceptable and even desirable behavior with some use cases (i.e optional properties), and an error in others since the lobotomized data structured may end up too crippled or even invalid.

The problem is thus to define the appropriate behaviour of the decoder when a slicing or pruning occurs. In some case it is an error, in others not, and in some case it depends on what part of the aggregate the slicing or pruning took place.

The solution

The decision whether it is an error or not is obviously context specific and have thus to be put in the hands of the user. So the problem boiled down to determine how the user would able to select the appropriate behavior.

The solution I came up was to provide three object deserialization methods.

1. A strict object decoder that would throw an exception and abort object decoding as soon as a missing object factory is detected. With this you get an exact reconstruction or a failure.

2. A lax object decoder that would slice and prune at will and return whatever comes out of it, and nothing else. This object decoder would for instance be used for exceptions.

3. Another lax object decoder, like the previous one, but that would also return a feedback on the missing object factories. The feedback on slicing would be an associative index mapping sliced object references to the list of their unrecognized classs types. The feedback on pruning would be a list of the different types of pruned object with a list of the unrecognized class type and the number of instance pruned.

The later method would make it possible and easy for the user to determine if slicing and pruning occurred, what are the missing factories and test for specific objects if slicing took place and to what extend. Since this method would give an easy way to test if slicing or pruning took place, the strict object decoder may seem unnecessary. The reason of its presence is that it may stop the decoding process as soon as a missing factory is detected and thus avoid wasting resources when an exact reconstruction is required and no feedback is needed.

I'm very satisfied by this solution because it keeps the API simple with only a small effort on the decoder implementation. What I still need to validate is how convenient it is to use.

0 Comments

Object deserialization handling

Leave a Reply.

Author

Categories

Archives