Sunday, April 25, 2010

JSON __sleep, __wakeup, serialize and unserialize

The JSON protocol is a de facto standard used in many different environments to transport objects, including arrays and Dates (serialized as strings), plus primitives such as strings, numbers, and booleans ... so far, so good!
Since this protocol is widely adopted but lacks the power of a well-known function such as PHP's serialize, we are often forced to remember how data has been stored and what it represents, and eventually to convert it back if we are dealing with instances rather than native hashes.
If we consider the ExtJS library architecture, we can basically take a snapshot of any component simply by storing its configuration object. The type will eventually tell us how to retrieve a copy of the original component, with no effort at all.
Since JSON is the preferred protocol to store data in WebStorages, databases, files, etc., and since we can basically save only hashes, losing every other property, I have decided to try this experiment, which brings some magic into the protocol itself.

JSONSerialize

The absolutely alpha version of this experiment has been stored here, in my little repository. If more than one developer, e.g. me, is interested, I may consider putting it on Google Code or GitHub with better documentation. What I can do right now is show you what JSONSerialize can do for us, step by step.

Normal JSON.stringify Behavior

This is what happens if we use JSON.stringify, or JSON.serialize, when an object is simply .... well, an object.

// let's test in console or web
if (typeof alert === "undefined") alert = print;

// our "class"
function A() {}

// our instance
var a = new A;

// a couple of dynamic properties
a.a = 123;
a.b = 456;

// our JSON call
var json = JSON.serialize(a);
alert(json); // {"a":123,"b":456}

Nothing new: serialize acts exactly like JSON.stringify, with or without extra arguments ... but things become a bit more interesting now ...

The _sleep Method

In the PHP world we all know there are several methods able to bring some magic into our classes, and __sleep is one of them. Since the double underscore is usually considered bad practice in JavaScript, where it marks private core magic functionality (e.g. __noSuchMethod__, __defineGetter__, __defineSetter__, others), I have decided to call it simply _sleep, considering it a sort of magic protected method, whatever that means in JavaScript :-)

A.prototype._sleep = function () {
    return ["a"];
};

json = JSON.serialize(a);
alert(json); // {"a":123}

The aim of _sleep is to return an array with zero, one, or more public properties we would like to export. As we can see, JSON.serialize takes care of this operation, returning only what is present in the list.
In a few words, there is no need to send properties we don't need. Cool?
Another advantage of a sleep method is that it can notify, close, disconnect, or perform whatever other operation we need to mark the instance as serialized. In other words, sleep can be useful every time we deal with a variable that may be updated elsewhere, using JSON as the transport protocol (postMessage, others).
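To make the whitelist idea concrete, here is a minimal sketch of how a _sleep list could be applied before handing the object to JSON.stringify. The helper name sleepAwareStringify and the Point class are hypothetical, and this is not the JSONSerialize source, just one plausible way to get the behavior described above.

```javascript
// Hypothetical helper, NOT the JSONSerialize source: it only illustrates
// how a _sleep whitelist could be applied before calling JSON.stringify.
function sleepAwareStringify(object) {
    // without a _sleep method, behave exactly like JSON.stringify
    if (typeof object._sleep !== "function") {
        return JSON.stringify(object);
    }
    var keys = object._sleep(); // e.g. ["x"]
    var exported = {};
    for (var i = 0; i < keys.length; i++) {
        // copy only the properties named in the list
        exported[keys[i]] = object[keys[i]];
    }
    return JSON.stringify(exported);
}

// illustrative class with two dynamic properties, only one exported
function Point() {}
Point.prototype._sleep = function () {
    return ["x"];
};

var p = new Point;
p.x = 1;
p.y = 2;

var sleepy = sleepAwareStringify(p); // '{"x":1}'
```

The standard JSON.stringify also accepts an array of property names as its second argument, but that list is applied at every nesting level, so a per-instance _sleep gives finer control.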

The serializer Property

_sleep is already a good starting point but, if we don't know anything else about that hash, how can we understand which kind of instance the a variable was?

alert(JSON.unserialize(json).constructor);
// Object

Too bad: we have been able to define what we would like to export, but we have no way to understand what we actually exported.
This is where the serializer property comes in handy, letting JSON.serialize behave differently.

// define the serializer
A.prototype.serializer = "A";

// re-assign the string and check it out
json = JSON.serialize(a);

alert(json); // {"a":123,"_wakeup":"A"}

// what's new? just this:

alert(JSON.unserialize(json).constructor);
// the function A() {} ... oooh yeah!!!

In a few words, with or without a _sleep method, we can bring serialized objects back to their initial status ... but why that _wakeup property? Thanks for asking!

The _wakeup Method

PHP has a __wakeup method too, which is invoked as soon as the string is unserialized! Since this method is somehow protected, I thought it was the best one to put into the JSONSerialize logic.

// let's define a _wakeup method
A.prototype._wakeup = function () {
    alert(this instanceof A);
    // will be true !!!
};

// let's try again
a = JSON.unserialize(json);

// ... oooh yeah!

As the PHP page shows in some examples, the _wakeup function can become really useful when we are saving a database connection, a WebStorage wrapper, or whatever cannot be persisted and needs to be initialized, so ... at least we can save some info rather than ask for it every time, can't we?
The moment _wakeup is invoked, the instance will already have every exported property assigned, exactly as in PHP ... want something more?
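The ordering is the important part: properties first, _wakeup afterwards. A minimal sketch of that sequence follows. The names registry, revive, and Connection are hypothetical, the explicit registry stands in for whatever lookup JSONSerialize really performs, and the sketch assumes the constructor itself is not invoked, which the post does not state.

```javascript
// Hypothetical registry standing in for the real type lookup; the names
// registry, revive, and Connection are illustrative only.
var registry = {};

function revive(json) {
    var data = JSON.parse(json);
    var Constructor = registry[data._wakeup];
    if (typeof Constructor !== "function") {
        return data; // unknown type: keep the plain hash
    }
    // assumption: create the instance without invoking the constructor
    var instance = Object.create(Constructor.prototype);
    for (var key in data) {
        // skip the marker so it does not shadow the _wakeup method
        if (data.hasOwnProperty(key) && key !== "_wakeup") {
            instance[key] = data[key];
        }
    }
    // every exported property is in place before _wakeup runs
    if (typeof instance._wakeup === "function") {
        instance._wakeup();
    }
    return instance;
}

function Connection() {}
Connection.prototype._wakeup = function () {
    // rebuild what could not be persisted, using the saved info
    this.url = "db://" + this.host;
};
registry.Connection = Connection;

var conn = revive('{"host":"localhost","_wakeup":"Connection"}');
// conn instanceof Connection is true, conn.url is "db://localhost"
```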

serialize And unserialize Methods

The PHP Serializable interface brings some other magic via the SPL: we decide what we want to export and we receive it back when unserialize is invoked. The same goes for JSONSerialize, where these methods have higher priority than _sleep and _wakeup but no better performance (right now; I may consider avoiding some extra operations to follow the current PHP status, though ...)

// introduce the serialize method
A.prototype.serialize = function () {
    return JSON.stringify({c:789});
};

// try this at home
json = JSON.serialize(a);
alert(json); // {"c":789,"_wakeup":"A"}

// this will call the _wakeup since
// unserialize has not been defined yet
// please note the a property won't be there anymore
// cause serialize returned a c instead
a = JSON.unserialize(json);

// let's have priority via unserialize
A.prototype.unserialize = function (data) {
    this.b = JSON.parse(data).c;
};

// let's try again, _wakeup won't be invoked
a = JSON.unserialize(json);

// a won't be there, but b will
alert([a.a, a.b]); // undefined,789
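The priority rules can be sketched roughly as follows. The names pack, unpack, and B are hypothetical and the real JSONSerialize code may be organized differently; the sketch assumes serialize returns a JSON string and unserialize receives one, as in the example above, which is also where the extra parsing passes come from.

```javascript
// Illustrative names only (pack, unpack, B); not the JSONSerialize source.
function pack(object) {
    var state = {}, key;
    if (typeof object.serialize === "function") {
        // serialize() returns a JSON string: parse it back (extra pass)
        state = JSON.parse(object.serialize());
    } else {
        for (key in object) {
            if (object.hasOwnProperty(key)) state[key] = object[key];
        }
    }
    // the marker tells the receiving side which type to restore
    state._wakeup = object.serializer;
    return JSON.stringify(state);
}

function unpack(json, Constructor) {
    var state = JSON.parse(json), key;
    var instance = Object.create(Constructor.prototype);
    delete state._wakeup;
    if (typeof instance.unserialize === "function") {
        // unserialize(data) expects a string: stringify again (extra pass),
        // and _wakeup is skipped because unserialize has priority
        instance.unserialize(JSON.stringify(state));
    } else {
        for (key in state) {
            if (state.hasOwnProperty(key)) instance[key] = state[key];
        }
        if (typeof instance._wakeup === "function") instance._wakeup();
    }
    return instance;
}

function B() {}
B.prototype.serializer = "B";
B.prototype.serialize = function () {
    return JSON.stringify({c: 789});
};
B.prototype.unserialize = function (data) {
    this.b = JSON.parse(data).c;
};

var b = new B;
b.a = 123;
var packed = pack(b);             // '{"c":789,"_wakeup":"B"}'
var restored = unpack(packed, B); // restored.a is undefined, restored.b is 789
```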

And that's all folks!

JSONSerialize Pros

The serializer property accepts namespaces and does not use evaluation. The whole little script does not use evaluation at all, and I think this is good, especially for security reasons.
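Resolving a namespaced string without eval usually comes down to walking the dotted path one property at a time. The following is a minimal sketch of that technique; resolve and the root object are hypothetical names, with root standing in for the global object.

```javascript
// resolve is a hypothetical helper; root stands in for the global object.
function resolve(path, root) {
    var parts = path.split(".");
    var current = root;
    for (var i = 0; i < parts.length; i++) {
        // stop early if any step of the namespace is missing
        if (current == null) return undefined;
        current = current[parts[i]];
    }
    return current;
}

// illustrative namespace: my.app.User
var root = {my: {app: {User: function User() {}}}};

var User = resolve("my.app.User", root);     // the User function
var missing = resolve("my.nope.User", root); // undefined
```

Since the string is only ever used as a sequence of property names, nothing in it can be executed as code.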
Nested objects and arrays are supported as well, which means that we can serialize complex hierarchies and get them back without effort and already initialized.
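For nested hierarchies, one way to get a similar effect with built-in tools is the standard reviver argument of JSON.parse, which visits inner values before outer ones. This is a swapped-in technique, not necessarily the traversal JSONSerialize performs, and the names types, reviveAll, Car, and Wheel are hypothetical.

```javascript
// Hypothetical type registry; JSONSerialize resolves names differently.
var types = {};

function reviveAll(json) {
    // JSON.parse calls the reviver for every value, innermost first
    return JSON.parse(json, function (key, value) {
        if (value && typeof value === "object" &&
            typeof value._wakeup === "string") {
            var Constructor = types[value._wakeup];
            if (typeof Constructor === "function") {
                var instance = Object.create(Constructor.prototype);
                for (var k in value) {
                    if (value.hasOwnProperty(k) && k !== "_wakeup") {
                        instance[k] = value[k];
                    }
                }
                return instance;
            }
        }
        return value; // primitives, arrays, and unmarked hashes
    });
}

function Wheel() {}
function Car() {}
types.Wheel = Wheel;
types.Car = Car;

var car = reviveAll(
    '{"wheels":[{"size":17,"_wakeup":"Wheel"}],"_wakeup":"Car"}'
);
// car instanceof Car and car.wheels[0] instanceof Wheel are both true
```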

JSONSerialize Cons

Nested means loops and, as with JavaScript JSON implementations, performance is surely slower than a native implementation. At the same time, we should consider when we need it, because our own code to bring instances back could cost more. Finally, if we all like this, we may push some browser vendors for a core implementation, no?
Another performance problem is with serialize and unserialize, since these require double parsing in order to respect the behavior.

In Summary

It's not the first time I have tried to enhance the JSON protocol to fit my requirements, and this is probably the least obtrusive and most secure way I could come up with. I hope somebody will appreciate at least the idea, and I am open to all your thoughts ;-)

2 comments:

Àl said...

Hi Andrea, I'm starting to think that the level of your articles is too high for every other JS developer to follow. ;-)

Anonymous said...

Bravo, what words..., a brilliant idea