Saturday, February 20, 2010

JavaScript Overload Patterns

Update: I have continued with these patterns in JavaScript Override Patterns.

We all know JavaScript does not implement native method overloading, and what we usually do on a daily basis is emulate this classic OOP behavior somehow.
There are several ways to do it, all of them with pros and cons, but what is the best way to implement it?

A common situation

Let's imagine we would like to have a method able to accept different argument types and return something according to what it received.
Usually in classic OOP an overload cannot redefine the returned type, but in JavaScript we can do "whatever we want" while still trying to be consistent, so we can consider ourselves luckier, and more flexible, than C# or Java guys.
For this post, the behavior we would like to obtain is similar to the one we can find in the jQuery library, one of the most used, famous, and friendly libraries I know.

// create an instance
var me = new Person();

// set some property
// via chained context
me
.name("Andrea")
.age(31)
;

// retrieve some property
me.name();
// Andrea

While the classic get/set concept is boring, old style, and not optimized, chainability is really smart, semantic, easy to understand, and a bytes saver.
As an example, compare the piece of code above with this one, following the comments to understand which parts are not that convenient:

// create an instance
var me = new Person();

// set some property
me.setName("Andrea");
// unoptimized operation: a setter
// does not usually return anything.
// We would expect at least the value we set,
// as if we did
// (me.name = "Andrea"),
// or a true/false
// operation flag.
// But when do we need a "false"
// that would break our code anyway?
// Better to throw an error if the operation
// was not successful, since we cannot
// go on with what we expect

// set something else
me.setAge(31);

// we need to rewrite the variable name
// for every single method call
// plus we need to write every method
// with a prefix: get or set
// this is redundant and inefficient
me.getName(); // Andrea

If we pass this silly little example through minifiers and mungers, we'll find out that the chained version goes from 72 to 56 bytes, while the get/set version goes from 76 to 68.
To summarize, while overload strategies could be adopted in both cases, the example code will use the first strategy to perform the task: "create a Person class and a method able to behave differently according to the received arguments".

Inevitably via "arguments"

As in every scripting language, the basic strategy to emulate an overload is lazy arguments parsing.
Lazy, because for each method call we need to work out at runtime what the expected behavior is.
This surely has a performance cost, so even if the topic here is overloading, we should never forget that sometimes the best way to set or get something is simply setting or getting it:

// if we can set something
// the internal property is somehow exposed
// in any case since it is reconfigurable
me.setName("Andrea");
me.setName("WebReflection");
me.getName();

// against
me.name = "Andrea";
me.name = "WebReflection";
me.name;
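
As a minimal sketch of what lazy arguments parsing means in practice, the hypothetical function below (area is not part of the Person example) has to inspect arguments.length and the argument types on every call before it can decide what to do. That per-call inspection is the cost mentioned above.

```javascript
// hypothetical example, not part of the Person class:
// one entry point, behavior decided per call by inspecting arguments
function area(a, b) {
    // two numbers: rectangle
    if (arguments.length === 2 && typeof a === "number" && typeof b === "number") {
        return a * b;
    }
    // one number: square
    if (arguments.length === 1 && typeof a === "number") {
        return a * a;
    }
    // one object with width and height properties
    if (arguments.length === 1 && a !== null && typeof a === "object") {
        return a.width * a.height;
    }
    // nothing matched: fail loudly instead of guessing
    throw new TypeError("area: unsupported arguments");
}
```

Every branch is a runtime check that a typed language would resolve at compile time, which is exactly why a plain property access wins whenever no decision is needed.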


Avoid pointless overheads


Of course getters and setters could be considered safer, but how many times have we spotted code like this?

// somewhere in some prototype
setStuff: function (_stuff) {
    this._stuff = _stuff;
},
getStuff: function () {
    return this._stuff;
}

The code above simply adds overhead to each operation performed against the _stuff property, so what's the point of doing it that way?
Unless we perform a type check on each call, this style, in a programming language without type hints, could simply be considered an error, since "error" is usually the first thought we have whenever we spot a badly implemented design.
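
For contrast, here is a sketch of a setter that earns its overhead by validating the input before storing it. The Box constructor and the choice of accepting only strings are assumptions made purely for illustration.

```javascript
// hypothetical class, used only to host the get/set pair
function Box() {}

Box.prototype = {
    constructor: Box,
    setStuff: function (_stuff) {
        // the check is the reason for the setter to exist
        if (typeof _stuff !== "string") {
            throw new TypeError("stuff must be a string");
        }
        this._stuff = _stuff;
    },
    getStuff: function () {
        return this._stuff;
    }
};
```

Without the typeof check, this pair would do nothing a plain property could not do faster.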

The Simplest Overload Implementation

The latest code could already be compressed into a single method. We want to perform different operations according to a generic argument, plus we would like to "kick Java devs' ass" by overloading the returned value as well, to make chainability possible.

function Person(_name) {
    // we would like to have a default name
    this._name = _name || "anonymous";
}

Person.prototype = {
    constructor: Person,
    name: function (_name) {
        // what should we do?
        // if _name is not undefined or null
        if (_name != null) {
            // we want to set the name
            // (possibly doing checks)
            this._name = _name;
            // and return the Person instance
            return this;
        } else {
            // nothing to set, it must be a get
            return this._name;
        }
    }
};

A simple test case:

var me = new Person();
alert([

me.name(), // anonymous

me // magic chain!
.name("Andrea")
.name() // Andrea
].join("\n"));
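
The _name != null check relies on loose equality: null is loosely equal only to null and undefined, nothing else. A small sketch, using a hypothetical isProvided helper, shows which values count as "an argument was passed":

```javascript
// hypothetical helper mirroring the check used inside name()
function isProvided(value) {
    return value != null;
}

var results = [
    isProvided(undefined), // false, treated as a get
    isProvided(null),      // false, treated as a get
    isProvided(""),        // true, treated as a set
    isProvided(0),         // true, treated as a set
    isProvided(false)      // true, treated as a set
];
```

One consequence is that name("") sets an empty name rather than behaving as a getter.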


Where Is The Overload

When we talk about overloading, we think about multiple methods, one of which is invoked at runtime according to the received arguments.
Since, as I have said, we cannot do this in JavaScript:

// what we are basically doing
// ECMAScript 4th Representation
function name(_name:String):Person {
    this._name = _name;
    return this;
}
// overload
function name(void):String {
    return this._name;
}

// what Java/C# guys should write
// to emulate the current behavior
class Person {
    object name(string _name) {
        this._name = _name;
        return this;
    }
    // overload
    object name() {
        return this._name;
    }
}

// example
string name = (string)(
    (Person)(
        new Person()
    ).name("Andrea")
).name();

all we usually do is put a lot of if, else if, else, and switch statements inside the same method.

Overload Pros

I do believe it's hard to argue against the idea that overloads make code easier to maintain, clearer, and more linear.
While strictly typed languages do not need to take care of argument types inside the method, scripting languages often have to deal with this matter, and the code inside a single method could become messy enough to be hard to maintain or debug.
In a few words, with overloads we can split behaviors, increasing focus on single cases and delegating arguments parsing, when and if necessary, to the single exposed entry point: the method itself.

Overload Via Polluted Prototype

The first overload implementation to obtain exactly the same behavior is this one:

Person.prototype = {
    constructor: Person,

    // exposed public method
    name: function (_name) {
        // exposed entry point
        // where we decide which method should be called
        return this[_name == null ? "_nameGet" : "_nameSet"](_name);
    },

    // fake protected methods
    _nameSet: function (_name) {
        this._name = _name;
        return this;
    },
    _nameGet: function () {
        return this._name;
    }
};

Pros

We can "cheat" a little bit in the entry point, using this to decide which "protected" method we need to call.
In my case I don't even mind that _nameGet receives an argument it does not accept, which is still bad design since the operation is tricky without a valid reason, but at least we have more flexibility. If we think that every method could have common operations inside, e.g. a User class with a protected this._isValidUser() performed by every other method call, this approach could be considered good enough to split all the matters/logic we are interested in.
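
A sketch of that idea in the polluted prototype style, where the public method runs the shared "protected" check before dispatching. The User constructor, its _name/_pass fields, the age method, and the validity rule are all assumptions for illustration only.

```javascript
// hypothetical User class with a shared "protected" check
function User(_name, _pass) {
    this._name = _name;
    this._pass = _pass;
}

User.prototype = {
    constructor: User,

    // public entry point: common check first, then dispatch
    age: function (_age) {
        if (!this._isValidUser()) {
            throw new Error("Unauthorized User");
        }
        return this[_age == null ? "_ageGet" : "_ageSet"](_age);
    },

    // fake protected methods, all reachable through this
    _isValidUser: function () {
        return this._name != null && this._pass != null;
    },
    _ageSet: function (_age) {
        this._age = _age;
        return this;
    },
    _ageGet: function () {
        return this._age;
    }
};
```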

Cons

Especially for debugging, all those fake protected methods could be really annoying. All we want to show/expose are public methods, so that a for in loop or a console.log operation, for example, shows just the "public status" of an instance and not every single internal method we are not interested in. Moreover, some "clever" developer could decide to call internals directly, and we would like to avoid any kind of critical operation, wouldn't we?
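
To see the problem, here is a quick sketch that collects what a for in loop finds on an instance built with the polluted prototype (the Person definition is repeated so the snippet is self-contained):

```javascript
function Person(_name) {
    this._name = _name || "anonymous";
}

Person.prototype = {
    constructor: Person,
    name: function (_name) {
        return this[_name == null ? "_nameGet" : "_nameSet"](_name);
    },
    _nameSet: function (_name) {
        this._name = _name;
        return this;
    },
    _nameGet: function () {
        return this._name;
    }
};

// collect every enumerable key, own and inherited
var keys = [];
for (var key in new Person()) {
    keys.push(key);
}
// keys includes _nameSet and _nameGet next to name and constructor
```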

Overload Via Closure

This second approach is able to maintain overload benefits, exposing only what we want:

Person.prototype = {
    constructor: Person,
    name: (function(){
        // method closure
        // we can share functions
        // or variables if we need it

        // the method, named, easier to debug
        function name(_name) {
            return _name == null ? get.call(this) : set.call(this, _name);
        }

        // overload
        function set(_name) {
            this._name = _name;
            return this;
        }

        // overload
        function get() {
            return this._name;
        }

        // the exposed method
        return name;
    }())
};

Pros


A for in loop will pass only the constructor and the public method, so nobody can use or modify the internals. Not even other methods can access the "name" closure, so we are free to use common meaningful names such as, for example, get and set.
If at some point every overload needs to perform the same task, we can simply create another function, and it will be shared within the whole closure.
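
A sketch of that last point: a hypothetical clean helper lives in the closure next to get and set, so every overload can use it without it being exposed on the prototype. The trimming rule itself is just an assumed example of a shared task.

```javascript
function Person(_name) {
    this._name = _name || "anonymous";
}

Person.prototype = {
    constructor: Person,
    name: (function () {
        function name(_name) {
            return _name == null ? get.call(this) : set.call(this, _name);
        }
        function set(_name) {
            this._name = clean(_name);
            return this;
        }
        function get() {
            return clean(this._name);
        }
        // hypothetical task shared by every overload,
        // invisible outside this closure
        function clean(value) {
            return String(value).replace(/^\s+|\s+$/g, "");
        }
        return name;
    }())
};
```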

Cons

Somebody could argue that those internal functions are not unit test friendly ... at least this was my guess before I talked with a skilled programmer who said: "dude, you should test what you expose and not the internal logic behind it. In other words, it's the name method you care about, and not what happens inside, which will be implicitly tested via 'name' calls".
Due to the closure, it is not possible to share a single helper function across the whole prototype.
This is not truly problematic, since in this case we can always use an outer closure:

var User = (function () {

    // private method shared in the whole prototype
    function _isValidUser() {
        // db operations here
        // this is just a silly example
        return this._name != null && this._pass != null;
    }

    // User class
    function User() {}

    User.prototype = {
        constructor: User,
        updateAge: function (_age) {
            if (!_isValidUser.call(this)) {
                throw new Error("Unauthorized User");
            }
            this._age = _age;
        },
        verifyCredentials: (function (){
            // internal closure
            function verifyCredentials(_user, _pass, create) {
                if (create === true) {
                    _create.call(this, _user, _pass);
                }
                return _isValidUser.call(this);
            }

            function _create(_user, _pass) {
                // store the received user name
                this._name = _user;
                this._pass = _pass;
            }

            return verifyCredentials;
        }())
    };

    return User;

}());


Optimized Overload via Closure

Since size always matters, and this is valid for performance too, this is an alternative closure example:

Person.prototype = {
    constructor: Person,
    name: (function(){
        function name(_name) {
            return _name == null ? get(this) : set(this, _name);
        }
        function set(self, _name) {
            self._name = _name;
            return self;
        }
        function get(self) {
            return self._name;
        }
        return name;
    }())
};

Everything else is the same, except we assume that no private/internal function requires call/apply: each one simply takes the instance as a mandatory first argument.

Patterns Size And Performances

All patterns have been passed through YUICompressor with all features enabled. These are the results, in bytes:

  • Simplest Overload (nothing is split, everything inside the single method): before 308, after 162, compression ratio: 47%.
  • Polluted Prototype: before 384, after 233, compression ratio: 39%.
  • Overload via Closure: before 464, after 240, compression ratio: 48%.
  • Optimized Overload via Closure: before 464, after 224, compression ratio: 52%.
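
The ratios above are the share of bytes removed, that is 1 - after / before, rounded to the nearest percent. A small sketch to reproduce them from the byte counts listed (the ratio helper is hypothetical, for illustration only):

```javascript
// percentage of bytes saved by compression
function ratio(before, after) {
    return Math.round((1 - after / before) * 100);
}

var ratios = {
    simplest: ratio(308, 162),  // 47
    polluted: ratio(384, 233),  // 39
    closure: ratio(464, 240),   // 48
    optimized: ratio(464, 224)  // 52
};
```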


Simplest Overload


We can easily deduce that the most common approach to behave differently is the classic method with "everything inside". Under the microscope, performance will be slightly better than with every other approach, since the number of function calls is reduced. But we should consider that my example does basically nothing, and that in most common cases the whole function body is polluted with a large number of variables, sometimes potentially disturbing ones (e.g. for (var i, ...)), and we can hardly debug long methods with hundreds of checks and if else in there. Of course the example I have chosen does not really represent the perfect scenario, where it's evident how much cleaner overloads are than the common code style, so in this particular case I would have chosen the first approach, but this is up to us, and depends on the case.

Polluted Prototype

While performance and semantics could appear a bit better, the compression ratio in this case is the worst one. Moreover, this approach implicitly suffers from name clashes. We should simply think about chained inheritance, and how many times we could have the same name across different mixins or classes.
If we add what we have already seen in the Cons, I would define this approach as the least convenient one, even though it is probably still the most adopted, since it derives from the classic protected method approach.
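
A sketch of such a clash: two hypothetical mixins, Named and Tagged, both rely on a "protected" _set helper. Copied onto the same prototype, the second silently replaces the first and breaks the method that relied on it. The mixin function is an assumed, minimal copy helper.

```javascript
// assumed minimal helper: copies own properties onto target
function mixin(target, source) {
    for (var key in source) {
        if (Object.prototype.hasOwnProperty.call(source, key)) {
            target[key] = source[key];
        }
    }
    return target;
}

// two unrelated mixins choosing the same "protected" name
var Named = {
    name: function (v) { return v == null ? this._name : this._set(v); },
    _set: function (v) { this._name = v; return this; }
};
var Tagged = {
    tag: function (v) { return v == null ? this._tag : this._set(v); },
    _set: function (v) { this._tag = v; return this; }
};

function Item() {}
mixin(Item.prototype, Named);
mixin(Item.prototype, Tagged); // overwrites the _set defined by Named

var item = new Item().name("Andrea");
// the name ended up in _tag: item.name() is undefined, item.tag() is "Andrea"
```

With the closure patterns each helper stays private to its own method, so this kind of collision cannot happen.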

Overload via Closure

This approach is already good for its purpose. No compiler or minifier can shorten a this reference, so I do hope developers will start to get rid of the classic call/apply approach in favor of the self one. There are no benefits over the polluted prototype, byte-wise, but all the Pros against the latter are still valid.

Optimized Overload via Closure

This pattern is the real winner:

function Person(a){this._name=a||"anonymous"}Person.prototype={constructor:Person,name:(function(){function b(d){return d==null?a(this):c(this,d)}function c(e,d){e._name=d;return e}function a(d){return d._name}return b}())};

As we can see, the number of this references in the minified output is reduced to three, rather than N for each private method call. The compression ratio is the best one, as is the size, except for the basic case.
Lots of pros, and "just a matter of style" as the only con: I already like this last pattern!

As Summary

We should always analyze different patterns, with their pros and cons, every time we decide to adopt a strategy to emulate something that is not truly part of the language's nature. This is what I have tried to do in this post, hoping some developers will appreciate it, start using some of these suggestions, or give me more ;)

14 comments:

Julian Jelfs said...

great as usual. So nice to have genuinely useful information in a blog post.

Mariusz Nowak said...

@Andrea
Do you use (or plan to use) getters and setters in your JavaScript (web) development ?
To me, until they're natively supported emulating them is quite overkill.
What do you think ?

Andrea Giammarchi said...

@Mariusz get/set in JavaScript are both powerful and slower than direct access.
The point is that get/set should be used to manifest a variable, and not as default access method.

Think about an Array length, to emulate it, we should use get/set so that get will return the current length, e.g. count(this.items), while set will eventually drop items or add undefined items.

Above behavior is not possible via direct access, but whatever library that will use get/set without thinking at all why they are using it, will be both bigger, size speaking, and slower, without any valid reason.

get/set with internal direct access, are a total nonsense, if there is no check over inputs when we set the property.

I do like get/set, I don't like bad usage of whatever construct we have in the whatever language we use ;)

Mariusz Nowak said...

@Andrea
Yeah I like get/set either and I take full advantage of it when I program in language that natively supports them.
However in JavaScript for the reasons you pointed I never really considered emulating them, anyway interesting read :)

unscriptable said...

Hey Andrea,

Excellent post!

When I first started reading your post, I wondered how you were going to address method overrides in subclasses. However, you didn't mention this.

I spend a significant amount of my programming effort overriding methods of objects in other people's libraries. If the other developer uses a closure to implement the get/set strategy -- and I need to override the set method to add logic or behavior -- I'll have to rewrite the entire set of functions in the closure. For simple get/set pairs, this is fine, IMHO. However, for more sophisticated situations (containing validation logic or side effects), some additional strategies could be used to allow injection of additional logic by subclasses.

I'm typing on a phone right now. Otherwise, I'd code some examples. Anyway, you'd probably do them more efficiently!

Thanks again for an excellent post!

Andrea Giammarchi said...

unscriptable, you are asking next post questions :P
I was already thinking about JavaScript Override Patterns post so please stay tuned here.

Anyway, to quickly answer, if you override, you override the name method so that if you are interested into "set", rather than get, you'll have simply this:

name: (function(){
    function set(_name) {
        // your code here
    }
    return function (_name) {
        return _name == null ?
            this.constructor.prototype.name.call(this) :
            set.call(this, _name);
    }
}())


You should never be able to modify parent overloads but you can always use the inherited method whenever you need so that you can override only one behavior, recycling others already defined.

Does it answer your question?

RStankov said...

Good post as usual.

I'm just wondering why you check for _name == null to see if _name have been passed. Because if I want to set name to null / 0 / false ... and so on. May be better to check for === undefined. Or even argument.length == 0;

Andrea Giammarchi said...

RStankov undefined is a variable that could be defined everywhere else.

=== undefined is a secure problem, imho.

A name suppose to be a string but in any case null == is false, as is null == false, as is null == whatever.

null is == null and undefined, nothing else.

A null name is not accepted, the concept is to understand if there is a reasonable argument or not.

Same is for length === 0, a waste of chars.
Since C language 0 is considered false, and every other integer is considered true.
Being a length an integer for sure, there is no reason to compare === 0.

Regards

unscriptable said...

Looking forward to your next post then! -- J

Andrea Giammarchi said...

kinda posted right now ;)

JavascriptBank.com said...

very cool & good tip, thank you very much for sharing.

Can I share this post on my JavaScript library?


Awaiting your response. Thank

Anonymous said...

nice post. thanks.

javiani said...

Hi Andrea, sorry my bad english, i'm a Brazilian. Thanks for the excellent post about overload patterns.

I had an idea about Javascript overloading, but I'm not quite sure if it's a good pattern, but I would like to share it anyway.

Something like this:

var Person = {
    Class : function(){
        //Private
        var name = ''
        var Interface = {
            name : {
                0 : function(){ return name },
                1 : function(n){ name = n },
                2 : function(a, b){ name = a + " " + b }
            }
        }

        //Public
        this.name = function(){
            return Interface.name[ arguments.length ].apply(this, arguments)
        }
    }
}

var Edu = new Person.Class

console.log(Edu.name())
Edu.name('Eduardo')
console.log(Edu.name())
Edu.name('Eduardo', 'Ottaviani')
console.log(Edu.name())


What do you think?