
Wednesday, January 04, 2012

Improving Function.prototype.bind

There are a couple of things I have never liked much about Function#bind, and this post proposes a pattern that is hopefully better than the common one.

At The End Of The Function

A common argument about parentheses around inline invoked functions is that developers can easily recognize them.

// considered ambiguous
var someThing = function () {
  return {};
}();

// considered not ambiguous
var someThing = (function () { // note the parenthesis
  return {};
}()); // note the parenthesis

// exact same behavior as the previous one
// but considered slightly ambiguous
var someThing = (function () { // note the parenthesis
  return {};
})(); // note the parenthesis

The parenthesized version apparently wins but, especially when no assignment is needed, I believe these are even less ambiguous:

-function(){
  alert("OK");
}();

+function(){
  alert("OK");
}();

!function(){
  alert("OK");
}();

Of course, if there is an operator, the function will be executed: why on earth would anyone do that otherwise?
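
One detail worth keeping in mind, and the reason these forms fit best "when no assignment is needed": the operator is also applied to whatever the function returns. A minimal sketch, where the returned values are made up purely for illustration:

```javascript
// Each prefix operator forces the function to be parsed as an expression,
// and is then applied to the value returned by the invoked function.
var negated = -function () { return 2; }();   // -(2)
var numeric = +function () { return "3"; }(); // +("3")
var inverted = !function () { return 0; }();  // !(0)
var nothing = !function () {}();              // !(undefined)

console.log(negated, numeric, inverted, nothing); // -2 3 true true
```
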
Anyway, the problem with all of these human-friendly pattern recognition strategies is that the context of the function, or the behavior of the whole code, could be changed via bind, call, or apply, so in any case we need to scroll to the end to understand what is going on inside the function and what exactly is returned.

// ES3-style (non-strict) example
var something = (function(){
  // this could be the global object or something else
  return this.doSomeStuff();
}.bind(unexpectedObject)); // or call for other cases

// ES5
var something = (function(){"use strict";
  // if there is a this reference
  // we need to reach the end to understand
  // what it is
  return this.doSomeStuff();
}.bind(unexpectedObject));

In summary, bind used inline is more like a Yoda condition than a developer-friendly helper. The first point is that bind, as it is, forces the developer to always check the end of the function to know whether it has been executed inline or bound to a different context, just as Yoda conditions force the developer to read a statement "upside down".
Last, but not least, if it's just about bind, parentheses do not help at all: we still have to scroll to the end.

// perfectly valid, not even executed
// but still important to know
// what will be the context for future calls
var something = function(){
  return this.doSomeStuff();
}.bind(unexpectedObject);
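
To make the difference tangible, here is a small sketch that can be run as is. The `context` object and its `name` property are assumptions introduced only for this illustration; the point is that both functions open identically, and only their last line reveals what ends up in the variable.

```javascript
var context = { name: "ctx" }; // hypothetical object used as context

// Executed inline in strict mode: `this` is undefined
var invoked = (function () {
  "use strict";
  return typeof this;
}());

// Same opening line, but only bound: nothing has run yet
var bound = (function () {
  "use strict";
  return this.name;
}.bind(context));

console.log(invoked);      // "undefined"
console.log(typeof bound); // "function"
console.log(bound());      // "ctx"
```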


Team Speaking

OK, agreed, a function bigger than 10 lines is not that good and should be split, and blah blah blah ... the point is that sometimes functions have to be big, since they give us a way to create private variables and methods ... OK?
Also, if "the good part" was just putting parentheses around, there is some conflict with the idea that functions should not be that long ... I mean, how can we misread whatever happens in 10 lines?
The truly good part could be done by developers, especially those who work in a team.
The first good rule is to write a bloody single-line comment on the very first line; this alone would solve everything:

var something = function () {
  //! executed inline, returns object

  // ... the rest of the code ...
}();

var somethingElse = function () {
  //! bound to variable obj

  // ... the rest of the code ...
}.bind(obj);

var somethingCrazy = function () {
  //! bound to obj + executed inline

  // ... the rest of the code ...
}.bind(obj)();

-function(){
  //! bound to obj + executed inline
}.bind(obj)();

Got it? Nobody would ever complain about these little helpers, would they? But since we are all lazy devs, the second helper is a function declaration rather than a function expression.

function impossibruToConfuse(){}

var something = impossibruToConfuse.bind(obj);
var something = impossibruToConfuse();
var something = impossibruToConfuse.call(obj);

With a single line to read, I believe the problem is solved, isn't it? Still, there are ways to improve.

Unable To Retrieve Same Bound Function

The second most annoying thing ever, something that inevitably leads to uncomfortable patterns, is the inability to retrieve an already bound function through the same object.
In my humble opinion, it was a freaking design mistake to create a new bound object on each bind call, since I believe nobody in this world has ever used this "feature".

function task(){}

// Y U NO THE SAME !!!
task.bind(window) === task.bind(window);
// false ....

If the function is the same, it has the same context and the same inner and outer scope as the other function bound to the same object.
If we bind the same object twice we are doing it wrong, because most likely we never meant, or needed, two different bound callbacks/objects for that task.

An Error Prone Logic

The most basic failure of this logic is about listeners, both DOM listeners and self-implemented ones, which are extremely common in a language as increasingly asynchronous as JavaScript.

generic.addEventListener("stuff", stuff.bind(obj), false);

// how many times have you seen this later on?
generic.removeEventListener("stuff", stuff.bind(obj), false);

The above example won't work as expected ... of course it won't: developers expect the same function/object, but they get a new one instead.
This approach forces developers to temporarily store the bloody bound function in some private or global variable or property.

obj._boundStuff = stuff.bind(obj);
generic.addEventListener("stuff", obj._boundStuff, false);

// ... later on ...
if (obj._boundStuff) {
  generic.removeEventListener("stuff", obj._boundStuff, false);
}
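
Both behaviors can be compared side by side without a DOM. The sketch below uses `TinyTarget`, a hypothetical stand-in for an event target that simply keeps listeners in an array; it is not a real API, just enough to show why re-binding at removal time leaves the original listener attached.

```javascript
// Hypothetical stand-in for an event target: stores listeners in an array
function TinyTarget() {
  this.listeners = [];
}
TinyTarget.prototype.add = function (fn) {
  if (this.listeners.indexOf(fn) < 0) this.listeners.push(fn);
};
TinyTarget.prototype.remove = function (fn) {
  var i = this.listeners.indexOf(fn);
  if (i >= 0) this.listeners.splice(i, 1);
};

function stuff() { return this; }
var obj = {};
var target = new TinyTarget();

// A fresh bound function on each call: remove() cannot find it
target.add(stuff.bind(obj));
target.remove(stuff.bind(obj));
var leftAfterRebinding = target.listeners.length; // 1, still attached

// Storing the bound function once makes removal work
var stored = stuff.bind(obj);
target.add(stored);
target.remove(stored);
var leftAfterStoredRemoval = target.listeners.length; // 1, only the orphan remains
```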

Remember how lazy we are, to the point that parentheses became one of the good parts of JavaScript because a line of comment was too boring? Now look again at the latest example and think how many times you have done something similar ...

A Clever Object.prototype.boundTo

With ES5 we are lucky enough to be able to extend global constructors without affecting for/in, as long as our implementation is unobtrusive and, most importantly, makes sense.
I came up with this proposal in order to solve at least the latter problems described above: a lazy boundTo method.

!function (Object) {

  // (C) WebReflection - MIT Style License

  var // private scope shortcuts
    BOUND_TO = "boundTo", // or maybe "asContextOf" ?
    defineProperty = Object.defineProperty,
    bind = defineProperty.bind || function (self) {
      // simple partial shim for "not there yet" ES5 browsers
      var callback = this;
      return function bound() {
        return callback.apply(self, arguments);
      };
    }
  ;

  defineProperty(
    Object.prototype,
    BOUND_TO, {
      value: function (callback, remove) {
        // only the very first time
        // two private stacks are created
        // and related to the current object
        var
          cbStack = [],
          boundStack = [],
          self = this
        ;
        // overwrite the inherited BOUND_TO method
        // with the one we actually need
        defineProperty(
          self,
          BOUND_TO, {
            value: function boundTo(callback, remove) {
              var
                i = cbStack.indexOf(callback),
                callback = i < 0 ?
                  boundStack[
                    i = cbStack.push(callback) - 1
                  ] = bind.call(callback, self)
                  :
                  boundStack[i]
              ;
              // falsy values accepted
              // except null or undefined
              // so the bound callback is kept by default
              if (remove == false) {
                cbStack.splice(i, 1);
                boundStack.splice(i, 1);
              }
              // returns bound callback in any case
              // handy to remove listeners and clean stacks
              // in one single operation
              return callback;
            }
          }
        );
        // only the first time, invoke the overwritten method
        // use the latter one directly every other time
        return self[BOUND_TO](callback, remove);
      }
    }
  );
}(Object);

How is the above snippet supposed to be used? Here is a basic example:

generic.addEventListener("stuff", obj.boundTo(stuff), false);

// ... later on ...
generic.removeEventListener("stuff", obj.boundTo(stuff, 0), false);

// bear in mind that ...
obj.boundTo(stuff) === obj.boundTo(stuff);
// always ... unless we release explicitly the bound function/object
obj.boundTo(stuff, false) !== obj.boundTo(stuff);
// since once removed, the new obj.boundTo call will create a new one
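
The second argument relies on the loose `remove == false` comparison in the implementation, which is why both `0` and `false` release the bound function while an omitted argument keeps it. The helper below, `wouldRelease`, is a hypothetical name used only to isolate that check:

```javascript
// Isolates the loose comparison used by boundTo (hypothetical helper name)
function wouldRelease(remove) {
  return remove == false;
}

var withFalse = wouldRelease(false);         // true
var withZero = wouldRelease(0);              // true
var withEmptyString = wouldRelease("");      // true
var withNull = wouldRelease(null);           // false
var withUndefined = wouldRelease(undefined); // false, same as omitting it
var withTrue = wouldRelease(true);           // false
```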

Pros

No need to check, no need to store the bound version, safe against memory leaks, and nothing extra to do except calling the method with, and this is required, the same function.
The semantics are straightforward: rather than a Yoda condition, we have a clear operation that means: return the function created to bind this object as context.
Last, but not least, even if the function was not bound, everything should work as expected, as long as the generic removeListener method has been implemented properly (checking whether the callback/object was already attached).

Cons

Not really, but the function has to be the same. It must be said that two apparently identical functions could be just a copy and paste in two completely different outer scopes. This means there is no bulletproof way to tell whether a function written twice is equivalent to another one, since it is not possible to inspect the surrounding scopes, and it would be silly to do this even at the JS engine level. I am talking about another, unfortunately common, mistake:

// wrong examples ... does not work as expected

generic.addListener("whatever", function () {
  return this.whatever;
}.bind(object), false);

// followed by ...
generic.removeListener("whatever", function () {
  return this.whatever;
}.bind(object), false);

// or in this proposal case ...

generic.addListener("whatever", object.boundTo(function () {
  return this.whatever;
}), false);

// followed by ...
generic.removeListener("whatever", object.boundTo(function () {
  return this.whatever;
}), false);

// ... but this is not how JavaScript works ...

Memory-wise, the very first time we use this strategy on an object we create two extra arrays, which speed up operations by indexing each function and its bound version per object. This is inevitable in any case, and it is almost exactly what we do when we store the bound function once.
Once the object has been garbage collected, the related private arrays will have a reference count equal to zero, so the memory should be cleaned up automatically, including all functions bound to that object.

As explained in the example, it is in any case possible to release bound functions explicitly, so we can feel that memory usage is under control.
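
For comparison only, the same caching idea can be sketched as a standalone helper on top of a WeakMap, assuming an environment that provides ES2015 WeakMap and Map. This is not the proposal above: the free function `boundTo(obj, callback, remove)` and the `cache` variable are hypothetical names, and nothing is added to Object.prototype.

```javascript
// Hypothetical standalone helper, NOT the Object.prototype.boundTo proposal:
// caches bound functions per object without touching the object itself.
var cache = new WeakMap(); // object -> Map(callback -> bound callback)

function boundTo(obj, callback, remove) {
  var perObject = cache.get(obj);
  if (!perObject) {
    perObject = new Map();
    cache.set(obj, perObject);
  }
  var bound = perObject.get(callback);
  if (!bound) {
    bound = callback.bind(obj);
    perObject.set(callback, bound);
  }
  // same convention as the proposal: falsy, but not null/undefined, releases
  if (remove == false) {
    perObject.delete(callback);
  }
  return bound;
}

function stuff() { return this; }
var obj = {};

var first = boundTo(obj, stuff);
var same = boundTo(obj, stuff) === first;        // true
var released = boundTo(obj, stuff, 0) === first; // true, and the entry is dropped
var fresh = boundTo(obj, stuff) !== first;       // true, a new one is created
```

The trade-off is mostly one of compatibility: the array-based version works wherever ES5 does, while this variant depends on WeakMap being available.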

Why No Arguments

The number of possible argument combinations could be so large that this technique would no longer be interesting, especially regarding performance.
It does not really make sense to overcomplicate such a basic, most needed scenario, but if you think this is the biggest impediment, then the name should simply be asContextOf, so that there is no ambiguity with the native bind signature.
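
To make the argument problem concrete, here is a small sketch with native bind; `add`, `obj`, and `base` are made-up names for illustration. Each distinct list of pre-filled arguments produces a function with different behavior, so a cache keyed only by object and callback could not serve both, and a cache keyed by arguments as well would need one entry per combination.

```javascript
// hypothetical callback and context, for illustration only
function add(a, b) {
  return this.base + a + b;
}
var obj = { base: 10 };

// same function, same context, different pre-filled arguments
var plusOne = add.bind(obj, 1);
var plusTwo = add.bind(obj, 2);

var sixteen = plusOne(5);   // 10 + 1 + 5
var seventeen = plusTwo(5); // 10 + 2 + 5
```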

Update - Preserving Private Methods

Another reason to choose my proposal is the ability to bind proper private methods without using the current instance to store the bound method anywhere ... follow the es-discuss ML to know more, but here is the stupid, basic proof of concept:

// stupid useless example ... just as concept
var Counter = (function () {
  // private
  function increase() {
    this.clicks++;
  }

  function Counter() {
    // nobody out there should be able to retrieve the bound function
    document.addEventListener("click", this.boundTo(increase), false);
  }
  Counter.prototype.clicks = 0;
  Counter.prototype.destroy = function () {
    // so that only this class scope can control things and nobody else
    document.removeEventListener("click", this.boundTo(increase), false);
  };
  return Counter;
}());


As Summary

I am pretty sure this simple, lazily created proposal would be ideal for many frameworks and projects, but if I am missing something, or if you have any improvement to suggest, I will be more than happy to listen.
Enjoy simplicity over semantics: this is yet another KISS and YAGNI proposal from this blog.

6 comments:

Unknown said...

Typo: A Clever Object.prototype.bondTo

Andrea Giammarchi said...

james.bondTo ... fixed, cheers

Unknown said...

I'm pretty sure you're keeping the callbacks in memory all the time. In order to really avoid keeping them in memory you'd have to use a WeakMap.

Andrea Giammarchi said...

I am pretty sure that would be a GC failure since once the object has been destroyed those arrays have 0 reference counters and should be collected together with all stored callback.

In any case, this is the reason I have implemented a way to explicitly drop already bounded callbacks so no reason to write something incompatible with the web out there as WeakMaps could be via ES6 ;-)

fritz the cat said...

Nothing wrong with Yoda conditions

Andreas Goebel said...

it's an interesting approach and a good article. However, personally I'm torn on this. I guess I'd always prefer bound/stored function declarations for the most part. With your solution there is probably a lot function call overhead plus, a different set of errors/mistakes that can be made (by not having distinct function objects).