Thursday, October 15, 2009

DOM Node Proxy

This is just a quick post from home sweet home.
A common DOM-related problem is how to create an association between a node and a generic object. The dirtiest, most leak-prone, and most obtrusive way to perform this task is this one:

document.body.obj = {
    prop: "value",
    otherProp: function(){}
};

The above snippet is bad practice for several reasons.
  1. obtrusive: it assumes that no other library will use the "obj" property name to perform an analogous task
  2. dirty: if we associate a primitive value, Internet Explorer will expose it in the node string representation
  3. memory leaks: if the object points to something "live", such as another node or an HTMLCollection, the node will never be collected by the garbage collector


Alternatives

Especially to avoid the last problem, the memory leak, it's good practice to store an index rather than an object. To make things less obtrusive and get rid of conflicts, we usually create a "unique id" to use as the property name.

// the array with all objects
var stack = [];

// the unobtrusive property name
var expando = "prefix" + new Date().getTime();

// the object to relate
var o = {};

stack.push(o);

// the relation via index (last object)
document.body[expando] = stack.length - 1;

As I have already linked and explained, this technique is still dirty because the index is a primitive value, so Internet Explorer will show the unique id via outerHTML or the generic node representation.
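
As a rough sketch only, the index technique above could be wrapped in a pair of helpers so that the array and the property name stay private. The names data, set, and get are hypothetical, and a plain object stands in for a DOM node so the snippet runs outside a browser:

```javascript
// Hypothetical helpers (not from the original post): the node only
// receives an index, the related objects live in a private array.
var data = (function () {
    var stack = [];
    var expando = "prefix" + new Date().getTime();
    return {
        set: function (node, value) {
            var i = node[expando];
            // first association for this node: reserve a new index
            if (i == null) i = node[expando] = stack.length;
            stack[i] = value;
            return value;
        },
        get: function (node) {
            // undefined when the node was never associated
            return stack[node[expando]];
        }
    };
})();

// Assumption: a plain object used in place of a real DOM node.
var fakeNode = {};
var related = { prop: "value" };
data.set(fakeNode, related);
var found = data.get(fakeNode);
```

With a real node in IE the stored index is still a primitive, so it would still show up in the string representation.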

Strategies

jQuery, and many others, create an association for each manipulated DOM node. This can consume RAM for no reason, since there could be a lot of nodes with an associated object that will never be used.
The next version of jQuery, right now in alpha stage, addresses this point by changing the object association logic. I have not read how yet, but I would like to write about something I've been using for a while: a sort of proxy object created for DOM node and object relations.

DOM Node Proxy

var proxy = (function(){
    // another (C) WebReflection silly idea
    var expando = "@".concat(+new Date, Math.random()),
        stack = []
    ;
    return function proxy(){
        return stack[this[expando]] || stack[
            this[expando] = new Number(stack.push({}) - 1)
        ];
    };
})();

The above snippet uses almost every strategy I know to avoid an obtrusive property, a dirty layout, and direct object assignment (index strategy).
To better explain what exactly the above function does, I have commented each part of it:

var proxy = (function(){

    // another (C) WebReflection silly idea

    // the function to assign to a node as its proxy.
    // Since it is assigned directly as a property,
    // standard browsers won't modify attributes, and
    // since the proxy is an object (not a primitive
    // value) IE won't expose it in the node string
    // representation (e.g. outerHTML)
    function proxy(){

        // a proxy call has a cost only once:
        // the first time it's called.
        // Other calls will return the same object.
        // This is to avoid associating objects
        // with nodes that never need them
        return stack[this[expando]] || stack[

            // the index is the last one in the
            // private stack Array. To avoid
            // leaks we don't associate directly
            // an object but simply an integer.
            // If we directly associate
            // a primitive value, IE will expose
            // it in the DOM string representation
            // (e.g. outerHTML)
            // To avoid this we can just assign
            // a Number instance, rather than
            // a primitive "number"
            this[expando] = new Number(

                // push returns the new length:
                // we need the index of the last
                // inserted object to relate it
                stack.push({}) - 1
            )
        ];
    }

    var
        // private unique expando with
        // an invalid char as prefix
        // in order to make the property name
        // easy to recognize in a possible
        // IE attributes loop
        expando = "@".concat(
            +new Date,
            Math.random()
        ),

        // list of associated objects
        stack = []
    ;

    // ready to go!
    return proxy;

})();

Is it clear enough? This is a simple usage example (please read the NOTEs):

onload = function(){

    // associate a proxy
    // NOTE: this is still obtrusive
    // the property name should be a unique id
    // or it should have library prefix
    // otherwise we could have conflicts
    document.body.proxy = proxy;
    // aka: node[expando] = proxy;

    // retrieve the proxy object
    var p = document.body.proxy();

    // test proxy: true
    alert(p === document.body.proxy());

    // test clean body string representation
    alert(document.documentElement.innerHTML);

    // find proxy created property
    for(var k in document.body){
        if(k.charAt(0) === "@")
            alert([k, document.body[k]]);
    }
};
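
The same checks can be sketched without a browser, which makes the lazy, one-object-per-node behavior easier to see. This is only an illustration: nodeA and nodeB are plain objects standing in for DOM nodes (an assumption, not part of the example above), and the proxy function is repeated so the snippet is self-contained:

```javascript
var proxy = (function () {
    var expando = "@".concat(+new Date, Math.random()),
        stack = [];
    return function proxy() {
        return stack[this[expando]] || stack[
            this[expando] = new Number(stack.push({}) - 1)
        ];
    };
})();

// Assumption: plain objects used in place of real DOM nodes.
var nodeA = { proxy: proxy };
var nodeB = { proxy: proxy };

// nothing is created until the first call
var a = nodeA.proxy();
a.clicks = 1;

// same node, same object; different node, different object
var sameAgain = nodeA.proxy() === a;
var separate = nodeB.proxy() !== a;

// the only extra property on the node is the "@..." expando,
// holding a Number instance rather than a primitive
var keys = [];
for (var k in nodeA) {
    if (k.charAt(0) === "@") keys.push(k);
}
var kind = typeof nodeA[keys[0]];
```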


In Summary

This is more of a proof of concept, but I hope the code shown will help us replicate the behavior. The main missing part is the internal stack management: how can I clean up the stack index when I don't need the node anymore? All we need is an extra function in the proxy scope, or a specific associated instance rather than a raw object.
In a few words, there is no best strategy for this second problem; it just depends on what we need.
From a logical point of view, if we give indirect access to that stack, exposing its length or functions able to modify it, stack safety could be compromised. What I could suggest is something like:

var proxy = (function(){
    // (C) WebReflection - MIT Style License
    function proxy(){
        return stack[this[expando]] || stack[
            this[expando] = new Number(stack.push(new $proxy) - 1)
        ];
    }
    function $proxy(){
        // evaluated before push: the current length
        // is the index this instance is about to get
        this._index = stack.length;
    }
    $proxy.prototype.destroy = function destroy(){
        // leaves a hole in the stack, indexes never shift
        delete stack[this._index];
    };
    var expando = "@".concat(+new Date, Math.random()),
        stack = []
    ;
    return proxy;
})();

where the stack is manipulated indirectly while nothing is publicly exposed.
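
A quick sketch of how destroy behaves in practice, again with a plain object in place of a DOM node (an assumption made so the snippet runs anywhere). The function is repeated to keep the example self-contained:

```javascript
var proxy = (function () {
    function proxy() {
        return stack[this[expando]] || stack[
            this[expando] = new Number(stack.push(new $proxy) - 1)
        ];
    }
    function $proxy() {
        this._index = stack.length;
    }
    $proxy.prototype.destroy = function destroy() {
        delete stack[this._index];
    };
    var expando = "@".concat(+new Date, Math.random()),
        stack = [];
    return proxy;
})();

// Assumption: a plain object used in place of a real DOM node.
var node = { proxy: proxy };

var first = node.proxy();
first.destroy();

// the stale index now points to a hole, so the next call
// falls through and creates a fresh instance with a new index
var second = node.proxy();
var recreated = second !== first;
```

Note that delete only empties the slot: the private array never shrinks, so a long-lived page that creates and destroys many proxies would probably want to reuse freed indexes.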

13 comments:

Mariusz Nowak said...

Interesting approach.

One thing - once in some benchmarks (I don't remember now where it was) I read that creating objects can be slow in mozilla, and you are creating new object for every push to stack. Maybe it would be cleaner to use same object for each push (?)

Andrea Giammarchi said...

Mariusz Nowak I think you missed the whole point about DOM and Object relations ... if it is always the same object you don't need anything, one DOM, one object, now how can be this convenient, just use a global object, no? The relation between a DOM node and an Object is to attach properties, methods, info, without affecting directly the DOM node. This relation must be unique, as every library does so far, got it? :-)

The test you are talking about does not make sense, in JavaScript almost everything is an object, the problem is if you create an object for each parsed DOM node, as jQuery and other do right now.

This post is about a strategy where the relation and the object is created only once and only if necessary, rather than always.

Regards

Daniel Steigerwald said...

Hi, using boxed primitives is interesting idea, but I decided to wrap index into array.
Here is my final solution to get unique ID of almost anything.
http://gist.github.com/210963

Andrea Giammarchi said...

Hi Daniel, interesting solution but all I need is ONE unique id, used as property name, and nothing else :-)

Nicolas said...

Wouldn't it be better to let the proxy function get the element as formal parameter instead of referring to *this*?

It'd still work and you wouldn't be appending/creating two properties in your DOMElement (e.g. the 'proxy' property and the '@stuff' property) whenever you want to use the proxy method on an element.

Andrea Giammarchi said...

Nicolas, there is only one property, and no arguments to send ... so the answer is no :-)

Nicolas said...

Hi Andrea,

I'm sorry but I still don't understand.

In your 'onload' example you first assign the proxy method to the proxy property of the body element:

document.body.proxy = proxy;

That makes one property added.

After that you're calling the proxy method, which adds the '@stuff' property to the 'body' DOMElement making two properties added to the element.

document.body.proxy(); //Adds a new '@something' property to body

That makes two properties added to the domElement: the 'proxy' property which contains a method and the '@stuff' property which contains an integer.

Of course, unless I'm missing something.

What I was just saying was that you could just make the proxy signature proxy(elem) to take an element and use that 'global' function with any DOMElement you like without adding a 'proxy' attribute to your DOMElements each time you want to proxy an element.

Daniel Steigerwald said...

Isn't it a nice idea to be able to have a unique id for everything, not only elements? Once you realize it, you will like my $uid function :)

kangax said...

In Prototype.js we use array with 1 item — id string — to work around "serialized primitive" issue. I wonder if Number object (or String object with that one index value) is lighter (memory-wise) than Array object that we currently use :)

Andrea Giammarchi said...

@Nicolas the NOTE is about that assignment, that should use a unique id but you are right, there are two properties but since these properties do not make the node dirty in string representation and both will be unique, that is the reason the id is considered unobtrusive.
I don't know any valid reason to perform a "for in" over a DOM node since this is not an object and what's exposed is not standard at all, at least the behavior, plus the node type, so I can't spot any concrete side effect. Apologies for the misunderstanding.

@Daniel, a unique id is generally something created in a private scope and hopefully not shared between libraries, or we could fall into conflicts.

@kangax, if you guys have to retrieve the property every time via Array access I bet a Number instance will be faster, but no concrete tests yet. Anyway, the usage of new Number is something I've never spotted in any library, so it's worth a try ... why not :-)

Daniel Steigerwald said...

@andrea - why you said that? or maybe what you were trying to say? You afraid of uniqueness of such generated id or what?

Andrea Giammarchi said...

@Daniel to get unique ID of almost anything libraries use just a unique id and not a shared one via an object, that's what I've said :-)

medikoo.com said...

@Andrea sorry, I got the idea (I use it in my toolkit for some time) .. however I didn't read your code clearly and at first thought that you just push index with that object
..sorry for that not well thought comment :)

Regards