My JavaScript book is out! Don't miss the opportunity to upgrade your beginner or average dev skills.

Saturday, February 28, 2009

On JavaScript Inheritance Performance - One Step Back

Few days ago I wrote a post about this argument, proposing an alternative "best option" way to use an injected parent in each method.

As soon as more developers read my post, more sparkles came out from the fire: true classical inheritance simulation via missed methods in the chain, exception during methods execution that could trap the temporary injected parent, and other interesting stuff again.

At the end of all these tests, benchmark, and libraries evaluation, I decided to step backward about my proposal, making things simple, logic, and extremely fast (as much as possible).

My Conclusions


  1. if we inject a parent, we have to change and/or wrap the original method, adding noise in the execution and inevitably more operations to perform (read: less performances)

  2. for each inherited metod, even if it is the same of the parent one, we need to add functions and code which means more RAM and less portability

  3. more we over-mess the simple JavaScript inheritance stack, less control we'll have for debugging and maintenance

  4. using strategies like the one adopted from the YUI library could only add confusion because if we put the instance in a parent method, the super one or the super.super.super (and go on), its parent will be still the original one and if we call this.superclass in the wrong context, results will be always unpredictable for that method (unless we did not write down every specific case, if the instanceof A called that method ... if the instanceof B called that method, etc etc)


Accordingly, the fact we are all lazy developers should not mean we can loose performances, use more RAM, make things more complicated than we need, specially in this Web era where devices with limitations could use our code without problems (iPhone) but only if performances are reasonable and downloaded code size is, again, reasonable.


The new best option to use a parent method

Following the test proposed in the old post, this is how I can obtain best performances even against classical way to use parent methods:

function WRSub(name, age){
// classical and debug prone parent call (fastest)
WRClass.call(this, name);
this.age = age;
};
wr.extend(WRSub, WRClass, {
// WebReflection parent method call in closure
// faster than runtime method resolution:
// WRClass.prototype.toString.call(this)
toString:(function(toString){
return function(){
return toString.call(this) + ": " + this.age;
}
})(WRClass.prototype.toString)
});



Summary

I love performances, readable code (big lol from kangax? :P), code portability, logic execution, and all we need is already there in the language.
We do not need to over mess the inheritance chain, the only thing we need is a god way to extend able to understand if the browser is missing hidden methods (toString, etc) and nothing else.
Once we have this good way to extend a constructor, everything else will be fast, readable, debug prone, and compatible with every possible situation, without breaking our neck to understand why that method in that case behaved differently, or that parent did not change as expected, etc etc ... robust, reliable, and fast code? Forget all these parent/$super/superclass implementations, put what you need in a closure to simplify and speed up the code and that's it, as my new benchmark shows where my clean and simple implementation bites every other library, even the classical full namespace way.


News about WebReflection Library

I have added a couple of things which are truly common on daily basis development. You can find a list of all JS 1.6 Array methods to use with whatever list you want, DOM Collections included.

wr.Array.forEach.call(document.getElementsByTagName("div"), function(div, i, collection){
div.innerHTML = "I am div " + (i + 1);
});

Those are compatible with ECMAScript specs, native performances for updated browsers, best performances for IE and others.
The wr.Object.forIn is the classical for in loop plus hidden methods in IE, something we can use to extend constructors or to inspect objects.
I am considering to put the hasOwnProperty check by default to loop only properties assigned, and not those inherited, but for extending purpose, we need to loop everything.
setInterval and setTimeout are natives or wrapped to support extra arguments as well.
wr.id is the classic method to generate an expando or a unique id for each call while wr.is is a function with all native checks via Object.prototype.toString.

The purpose of the library is not (yet) to substitute other libraries, it is just about basic things we always need, focused on performances, compatibility, etc.
I'll add more, but so far enjoy the extend, now free of bugs since different, the forIn, the is, the id, and current Array implementations.

Tuesday, February 24, 2009

[OT] ... if Gmail is down I can't ... (aka: new era paranoia)


  1. receive emails from my PC or my Android

  2. send emails from my PC or my Android

  3. follow developers via mailing list via PC or Android

  4. chat via GTalk with buddies (mainly developers)


Bloody Hell I feel outside the world!!!

P.S. ... but fortunately in few minutes everything went back :D

Saturday, February 21, 2009

Yahtzee! On Line to print or simply play

You know ... sometimes in 2009 a dice based game could be an evening resolution or simply nice time to spend with somebody else.
I was out of paper and after the second try to draw manually the Yahtzee table I though: Why on earth I do not create a table to play with?

Here I am: Yahtzee! on line
You can print in landscape and reuse the paper 4 times (two tables in the front, two in the back) or to be even more green, you can simply put your green laptop in the corner of the table and play with the interactive table.
I usually play with F11 option (full screen browser), but I am not sure the automatic score is a good idea (more interesting to count manually who is winning, isn't it?)

Have fun :)

On My Vicious JavaScript Code

Disclaimer

This post is a reply to kangax question:I’m sorry but why such vicious coding style?
I'd like to explain my point of view about JavaScript code style so, if interested, please read until the end, thank you :)



Some Background

I study and use JavaScript since year 2000 and one of the most interesting thing about this language is that even if it is "that old" you never stop to learn something new (techniques, strategies, adapted patterns, etc).
At the same time, the "biggest trouble" of this language is its weight in a page, which is the reason we all use minified and gzipped or packed versions of our application. Coming from a "33Kb modem internet era" I always tried to pay attention about the size and I studied and created both a JavaScript minifier and a compressor.



Compressors Rules

Unless we are not using an "incremental" compressor, as GIF is, we should consider that packer algorithm, as gzip and deflate, are dictionary based compressors (Huffman coding).
These techniques to minimize our scripts size give us best compression ratio if we use a reduced dictionary. In few words, if we have these two piece of codes:

// code A
function tS(obj){
return obj.toString();
};

// code B
function toString(obj){
return obj.toString();
};

even if the length of the code A is smaller, it will use 1 or more bytes than code B once compressed.
The reason is really simple. Try to imagine that both codes are converted in this way:

// code A
0 1(2){3 2.4();};[function,tS,obj,return,toString]

// code B
0 1(2){3 2.1();};[function,toString,obj,return]

Code B has a reduced dictionary, and reduced dictionary means best ratio. That's why when I introduced years ago the little bytefx library I defined it "packer friendly" (variables name are similar in the entire code).



How Compressors Changed My Way To Code in JavaScript

Accordingly, and probably against every common programming naming convention practice, those studies let me think that as long as a variable name is meaningful enough, it does not matter if it is perfectly clear to recognize or not, since in every case when we want to understand a code, we need to read it from the beginning to the end. Here there are a couple of example:

// (pseudo)compilable language practices
var splitted = "a.b.c".split(".");
function argsToArray(args){
return Array.prototype.slice.call(args);
};

// my code style
var split = "a.b.c".split(".");
function slice(arguments){
return Array.prototype.slice.call(arguments);
};

The split variable is meaningful enough, it is something that has been splitted via split method, and the same is for the slice function: I use it to call the slice Array method over a generic ArrayLike object, as the Object arguments is.



Confusion, Ambiguity, WTF About arguments? Not Really!


The reason I chose inside the slice function the variable name arguments is simple: in that function I do not need its own ArrayObject arguments, while I will use that function mainly with an ArrayObject arguments variable from other functions.

function doStuff(){
var Array = slice(arguments); // other stuff
};

How can be then that arguments variable considered ambiguous?
Moreover, that function "will cost zero once compressed", because in every code I've seen so far the generic function arguments variable, plus the slice method from Array's prototype, are always part of the rest of the code. So, in terms of bytes, I have added a useful function in only 8 extra bytes: (){();};



OMG! I even called a variable Array inside the doStuff function!!!


Yes I did indeed, and the reason is still the same: almost every library or piece of code use the Array keyword but if I do not need the global Array constructor in that scope, why should I leave such meaningful dictionary word untouched?
Moreover, the Array global constructor is used mainly because of its prototype, but once I have an Array variable called Array I can simply use directly its methods instead of the global one, considering those are exactly the same.

function doStuff(){
var Array = slice(arguments);
Array.push.apply(this, Array);
return this
};

Finally, the day we will need the native global constructor, String, Object, Array, Function, we can always use the safe window.Array or, if we decided to name a variable window inside a scope, the self.Array or the this.Array or the in line global scope resolution call:

var toString = function(){return this.Object.prototype.toString}();




Is It Just About Naming Convention?

I use different strategies to be sure my code will be the smallest possible. As example, and as showed above on toString assignment, if I need a runtime called function which returns a value and it is not that big function, I do not use parenthesis around.
If an assignment or a return is before an end bracket "}" I do nt usually put the semi colon at the end. For the interpreter, it does not matter, because it transform in its own way the function.

alert(function(){return this.Object.prototype.toString});

// produced output
function(){
return this.Object.prototype.toString;
}
// indentation, plus semicolons where necessary

The rest is usually managed by the minifier, so the fact that a return"a" does not need a space between return and the String "a" is superfluous if we use, for example, a clever minifier as the YUI Compressor is. The same is for loop with a single line to execute, I do not use brackets (I love Python indentation style).
for(var key in obj)
if(typeof obj[key] == "string")
obj[key] = parse(obj[key]);




My Style Cons


  • tools like JSLint often fail to recognize my code as valid so I do not use JSLint (... but fortunately I do not need it, I now my code is beautiful :P)

  • in a Team we should all respect Team code conventions, so my code style could not be the best choice if other programmers do not know JavaScript that much and/or use a specific/different code style

  • if the development product will not use minifier + compressors my code style could produce bigger size instead of smaller (Object instead of obj or o)

  • my code could require a better lecture than a quick one, but this is true for every library if we would like to understand properly a trick, the entire scope, an assignment, etc, etc ...





My Style Pros


  • final size, minified and packed/gzipped, is most of the case reduced if compared with other "code scripters"

  • more big is the library I would like to extend, less my implementation costs, thanks to a wide range of words to use (common like key, Object, arguments, Function, etc plus others)

  • variable names are often more meaningful than other and instead of function(str, callback, obj){} I bet we cannot understand variables type with a function like function(String, Function, Object){}. So, if possible and when necessary, my code style is closer to strict programming language than every other, allowing me to save time writing directly String as variable name, instead of /* String */ str





Conclusion

I know for somebody my vicious style could be considered a non-sense or something too "extreme" or hard to maintain. But, on the other hand, I am not developing public libraries (I prefer to extend them rather than reinvent the wheel if it is not necesary at all) and my implementations, for me, are much more easy to understand/maintain than others since I often perform linear (sometime tricky) operations which make absolutely sense to me and in a single line when it is possible. You like it? You don't? It does not matter, at least now you know how much pervert a programmer mind could be for an "out of the common rules language" as JavaScript is.
Enjoy my craziness :D

Update
Since telling you I have studied these cases was not enough, here is a truly simple demonstration for the Anonymous user that left the comment:

<?php
$output = array();
for($i = 0; $i < 80; $i++)
$output[] = 'testMe';
$str = implode('.', $output);
echo 'WebReflection: '.strlen(gzcompress($str, 9)); // 21
echo '
';
$output = array();
for($i = 0; $i < 80; $i++)
$output[] = 'test'.str_pad($i + 1, 2, '0', STR_PAD_LEFT);
$str = implode('.', $output);
echo 'Anonymous: '.strlen(gzcompress($str, 9)); // 144
?>

21 bytes against 144 so please, if you think my point is wrong, demonstrate it :)

Thursday, February 19, 2009

On JavaScript Inheritance Performance and Libraries Troubles

Update: I have replied to myself and developers in a new post. I would like to say a big Thank You to every developer exposed tests, benchmark, traps, and considerations. Please read my last thoughts about the subject, since I deeply reconsidered my position.
Update: Please do not get me wrong. I have no intention to say that one or any of cited framework, piece of code, library, is not good or fast enough to extend other classes. This post is about maniac optimization based on personal considerations over some deeper analysis to complete in a way the first benchmark.
I am not criticizing libraries, they are great and they offer everything, I am simply showing a specific case, which is particular on purpose, and a specific behavior that, useful or not, could not be respected.


About


Few days ago Ajaxian published a post about JS inherited methods performances via common libraries strategies.
I think the argument is extremely interesting but there's not enough material yet, so here I am with my contribution plus a "best option" proposal.

Why We Need Libraries to Extend Functions


  • libraries (should) manage properly the prototypal chain

  • libraries (should) make code more elegant/readable

  • libraries (should) let us create overrides being able to call parent/super method implicitly if necessary

  • libraries are often based over a generic extended "class" or hierarchy and users should be familiar with that way to extend or create classes in order to avoid internal conflicts



Why We Do Not Need Libraries to Extend Functions


  • we are performances or bytes maniacs and we are scared by library obtrusive implementations

  • we don't trust the library pattern/strategy to extend function - we know a better way to do it simply and quickly

  • we would like to create hybrids unmanageable via used library


About above points, let me answer to this question:
"Which Common Code/Library is Good and Fast to Extend Properly?" Considering the YUI approach out of the game, NOT A SINGLE ONE!!!

Tested Libraries in Alphabetic Order

base2, dojo, jClass, MooTools, Prototype, WebReflection (blog proposal), and YUI


About the Test

I tried to create a particular inheritance case.
There is a parent class with a toString and a toLocaleString prototype method.
These methods are particular since they are hidden methods (read: native) and browsers like Internet Explorer do not discover hidden methods even if those have been explicitly assigned.
The extended class should be able to understand the inheritance without problems and produce the expected result, showed in the Ad Hoc example.


Ad Hoc concept and troubles


  • Concept: manual inheritance over usable constructors.

  • Troubles: requires a bit of skill and parent calls are always explicit.



// manual implementation
// best performances and expected behavior in every browser
function AdHocClass(name){
this.name = name;
};
AdHocClass.prototype.toString = function(){
return this.name;
};
AdHocClass.prototype.toLocaleString = function(){
return "locale: " + this.toString();
};
function AdHocSub(name, age){
AdHocClass.call(this, name);
this.age = age;
};
AdHocSub.prototype = new AdHocClass;
AdHocSub.prototype.constructor = AdHocSub;
AdHocSub.prototype.toString = function(){
return AdHocClass.prototype.toString.call(this) + ": " + this.age;
};

// This is the expected result We would like to obtain
var me = new AdHocSub("Andrea", 30);
alert(me); // Andrea: 30
alert(me.toLocaleString()); // locale: Andrea: 30



base2


  • Concept: runtime injected parent (called via base) over explicit initialization via init method

  • Troubles: every instance calls two functions when initialized: the contructor plus the init method. This slows down performances. At the same time, not every method is inherited properly.



var BaseClass = base2.Base.extend({
constructor:function(name){
this.name = name
},
toString:function(){
return this.name;
},
toLocaleString:function(){return "locale: " + this.toString()}
});
var BaseSub = BaseClass.extend({
constructor:function(name, age){
this.base(name);
this.age = age;
},
toString:function(){
return this.base() + ": " + this.age;
}
});
var me = new BaseSub("Andrea", 30);
alert(me); // Andrea: 30
alert(me.toLocaleString()); // ERROR: same as toString both IE and Others



dojo


  • Concept: lazy dependencies resolution via the initializer method

  • Troubles: the initializer method is not that linear and it is probably the slowest in the list. Sometimes we need to call explicitly the method, as first argument, while sometimes we need to modify the arguments object or create an array to call the parent method. I am not sure I used dojo best practices to replicate the scenario but that's what I found googling for a while (Update: thanks to Eugene Lazutkin I could use a better practice which improved consistently performances). Is the rest fine? Not really, Internet Explorer does not inherit some hidden method. Update: apparently dojo offers different possibilities to extend and "play" with parents, so this is just one of them.


dojo.declare("dojoClass", null, {
constructor:function(name){
this.name = name;
},
toString:function(){
return this.name;
},
toLocaleString:function(){return "locale: " + this.toString()}
});
dojo.declare("dojoSub", dojoClass, {
constructor:function(name, age){
this.age = age;
},
toString:function($super){
return this.inherited("toString", arguments) + ": " + this.age;
}
});
var me = new dojoSub("Andrea", 30);
alert(me); // Andrea: 30 - not in IE
alert(me.toLocaleString()); // ERROR: same as toString in IE



jClass


  • Concept: basically the same one used by base2

  • Troubles: even worse than base2 since there are no explicit searches for toString and valueOf (most common hidden methods)



var ResigClass = jClass.extend({
init:function(name){
this.name = name;
},
toString:function(){
return this.name;
},
toLocaleString:function(){return "locale: " + this.toString()}
});
var ResigSub = ResigClass.extend({
init:function(name, age){
this._super(name);
this.age = age;
},
toString:function(){
return this._super() + ": " + this.age;
}
});
var me = new ResigSub("Andrea", 30);
alert(me); // Error: [object Object] in IE
alert(me.toLocaleString()); // Error: [object Object] in IE



MooTools


  • Concept: runtime injected parent, similar to base2. In my opinion the most clean and elegant way to extend a class.

  • Troubles: Classes are created via Class constructor so we have same bottleneck spotted in base2. Inheritance problems are basically the same found in base2: some hidden method is not inherited.



var MooToolsClass = new Class({
initialize: function(name){
this.name = name;
},
toString:function(){
return this.name;
},
toLocaleString:function(){return "locale: " + this.toString()}
});
var MooToolsSub = new Class({
Extends:MooToolsClass,
initialize:function(name, age){
this.parent(name);
this.age = age;
},
toString:function($super){
return this.parent() + ": " + this.age;
}
});
var me = new MooToolsSub("Andrea", 30);
alert(me); // Andrea: 30 - not inherited in IE
alert(me.toLocaleString()); // ERROR: locale: Andrea in IE



Prototype


  • Concept: $super variable as parent sent as first argument, imho the most obtrusive way so far and not elegant/intuitive at all.

  • Troubles: something works as expected, but only toString and valueOf, as is for base2, are considered hidden methods. The extra argument and the method overload make this library almost slow as dojo inheritance implementation is.



var ProtoClass = Class.create({
initialize:function(name){
this.name = name;
},
toString:function(){
return this.name;
},
toLocaleString:function(){return "locale: " + this.toString()}
});
var ProtoSub = Class.create(ProtoClass, {
initialize:function($super, name, age){
$super(name);
this.age = age;
},
toString:function($super){
return $super() + ": " + this.age;
}
});
var me = new ProtoSub("Andrea", 30);
alert(me); // Andrea: 30
alert(me.toLocaleString()); // ERROR: same as toString in IE




YUI


  • Concept: a constructor link to its parent prototype via "this.constructor.superclass". No runtime operations, no injected overloads, just "raw methods" based on explicit calls.

  • Troubles: not really. The YUI approach is extremely simple and efficient then "of course it works"!. There are just a couple of steps resolved via YAHOO.lang.extend: the correct constructor reassignment after the prototype one, plus the superclass link. I have just a comment about YUI implementation: guys, why do not you avoid that function creation for each extend call?



// No var F = function(){}, i SUGGESTION
extend: function(F){
return function(subc, superc, overrides) {
if (!superc||!subc) {
throw new Error("extend failed, please check that " +
"all dependencies are included.");
}
F.prototype=superc.prototype;
subc.prototype=new F;
subc.prototype.constructor=subc;
subc.superclass=superc.prototype;
if (superc.prototype.constructor == OP.constructor) {
superc.prototype.constructor=superc;
}

if (overrides) {
for (var i in overrides) {
if (L.hasOwnProperty(overrides, i)) {
subc.prototype[i]=overrides[i];
}
}

L._IEEnumFix(subc.prototype, overrides);
}

}
}(function(){}),

And here we are with the scenario:

function YUIClass(name){
this.name = name;
};
YUIClass.prototype.toString = function(){
return this.name;
};
YUIClass.prototype.toLocaleString = function(){
return "locale: " + this.toString();
};
function YUISub(name, age){
this.constructor.superclass.constructor.call(this, name);
this.age = age;
};
YAHOO.lang.extend(YUISub, YUIClass);
YUISub.prototype.toString = function(){
return this.constructor.superclass.toString.call(this) + ": " + this.age;
};
var me = new YUISub("Andrea", 30);
alert(me); // Andrea: 30
alert(me.toLocaleString()); // locale: Andrea: 30



WebReflection Proposal

After an analysis like this one, how could I skip my "all the best from others without troubles" proposal?
These are my considerations:

  • The fastest way to use a parent method is an explicit call

  • The best way to perform above step is via a link able to remove explicit dependencies (read: MyParentClassName instead of this.constructor.superclass). In this way methods could be easily transported from a prototype to another one without problems (less memory usage and less code to maintain)

  • The this.parent() solution is the cleanest one and generally speaking more close to our concept of "Object Oriented JavaScript". It is portable, it is meaningful, it is shorter than an explicit call with or without a link but it is not that fast to execute.

  • Using best practices to cache all we need, create a quick wrapper, resolving hidden methods problems, will not bring us to native or YUI performances, but at least in a good scenario: fast enough to guarantee performances over code elegance and readability


So here we are with my proposal:

  • Concept: parent in prototype but changed runtime for hierarchy purpose only if necessary (read: only for overrides). This will allows us to call parent directly in the constructor, as example, so we won't have double calls for each instance. At the same time, this implementation lets us call explicitly a parent method from those whose were not inherited.

  • Troubles: performances are the best but still far from native one.



function WRClass(name){
this.name = name;
};
WRClass.prototype.toString = function(){
return this.name;
};
WRClass.prototype.toLocaleString = function(){
return "locale: " + this.toString();
};
function WRSub(name, age){
this.parent(name);
this.age = age;
};
wr.extend(WRSub, WRClass, {
toString:function(){
return this.parent() + ": " + this.age;
}
});
var me = new WRSub("Andrea", 30);
alert(me); // Andrea: 30
alert(me.toLocaleString()); // locale: Andrea: 30

// last, but not least
WRSub.prototype.explicitParentToString = function(){
return this.parent.prototype.toString.call(this);
// instead of
return this.constructor.superclass.toString.call(this);
};
alert(me.explicitParentToString()); // Andrea



WebReflection proposed Extend

For future improvements/bugs fixes, please use this link.

var wr = {
extend:function(parent, extend){
/**
* (C) Andrea Giamamrchi
* Mit Style License
*/
return function(self, Function, Object){
var prototype = function(){
this.prototype[key] =
this.prototype.parent && typeof Object[key] == "function" && typeof this.prototype[key] == "function" ?
extend(Function, this.prototype[key], Object[key]) :
Object[key]
;
};
if(Object){
parent.prototype = Function.prototype;
self.prototype = new parent;
self.prototype.constructor = self;
self.prototype.parent = Function.prototype.parent ?
extend(Function, Function.prototype.parent, Function) : Function
} else
Object = Function;
for(var key in Object)
prototype.call(self);
for(key in {toString:key})
return self;
//* ... for Internet Explorer only ...
for(var
split = "hasOwnProperty.isPrototypeOf.propertyIsEnumerable.toLocaleString.toString.valueOf".split(".");
key = split.shift();
)
if(Object.hasOwnProperty(key))
prototype.call(self);
//*/
return self
}
}(
function(){},
function(parent, extend, Function){
return function(){
this.parent = extend;
var result = Function.apply(this, arguments);
this.parent = parent;
return result
}
}
)
};



The Benchmark

I had to choose Prototype or MooTools, since these frameworks seem to have some problem to coexist. Since Prototype has been tested in the Ajaxian post, I decided to put MooTools in the middle.
Something to consider before you try the benchmark page ... I removed the purposeless direct method call since it is absolutely the same for every basic class, created via library or not.
The order is by performances, where generally speaking are these:

  1. Ad Hoc Manual Explicit Inheritance

  2. YUI lang.extend (close to Ad Hoc)

  3. wr.extend - WebReflection proposal (closer to jClass than YUI)

  4. jClass (truly close to base2)

  5. base2 (truly close to jClass)

  6. dojo

  7. Prototype

  8. MooTools


Try out The Benchmark

And have a nice week end ;-)

Saturday, February 14, 2009

Levenshtein algorithm revisited: 2.5 times faster

First of all one consideration:

JavaScript is SLOW!


Chrome, Safari, Opera, FireFox (Internet Explorer obviously is the only one 2 to 8 times slower!)
It does not matter which tracemonkey or V8 you are using, JS is far away to be fast as C#, C++, C are!

About Levenshtein


If you would like to create a project like BJSpell with a reasonable suggestion, the levenshtein algorithm will trill at some point: it calculates the distance between 1 string to another one. It means that from the word "Gofy" to the word "Goofy", this algo will return 1 character to change replace, or add, before Gofy will become Goofy.

Why Levenshtein


To obtain a suggestion that makes sense, we could use some known algorithm to recognize at least words close to the mistyped one. So far, my suggestion in BJSpell was something truly rude and often useless, but as far as I know there is NO WAY to implement something more clever, using every possible best performances practice.

How to speed up the algorithm


Every "bloody" implementation of this algorithm uses 1 more useless loop to pre-assign values to the multidimensional Array, plus a boring if else instead of ternary operator.
Removing those loops I switched performances from 1.680 seconds to 560, over 100000 computations using C#. The trick is simple: set the Array when you need it!
Here is the code I wrote for JavaScript first and C# after:

// JavaScript
var levenshtein = function(min, split){
// Levenshtein Algorithm Revisited - WebReflection
try{split=!("0")[0]}catch(i){split=true};
return function(a, b){
if(a == b)return 0;
if(!a.length || !b.length)return b.length || a.length;
if(split){a = a.split("");b = b.split("")};
var len1 = a.length + 1,
len2 = b.length + 1,
I = 0,
i = 0,
d = [[0]],
c, j, J;
while(++i < len2)
d[0][i] = i;
i = 0;
while(++i < len1){
J = j = 0;
c = a[I];
d[i] = [i];
while(++j < len2){
d[i][j] = min(d[I][j] + 1, d[i][J] + 1, d[I][J] + (c != b[J]));
++J;
};
++I;
};
return d[len1 - 1][len2 - 1];
}
}(Math.min, false);


// C#
public int levenshtein(string a, string b){
// Levenshtein Algorithm Revisited - WebReflection
if (a == b)
return 0;
if (a.Length == 0 || b.Length == 0)
return a.Length == 0 ? b.Length : a.Length;
int len1 = a.Length + 1,
len2 = b.Length + 1,
I = 0,
i = 0,
c, j, J;
int[,] d = new int[len1, len2];
while(i < len2)
d[0, i] = i++;
i = 0;
while(++i < len1){
J = j = 0;
c = a[I];
d[i, 0] = i;
while(++j < len2){
d[i, j] = Math.Min(Math.Min(d[I, j] + 1, d[i, J] + 1), d[I, J] + (c == b[J] ? 0 : 1));
++J;
};
++I;
};
return d[len1 - 1, len2 - 1];
}


As Summary


Levenshtein algorithm is still too slow if performed via JavaScript but for small dictionaries (up to 5000 words instead of 70000 or more as I tested with the entire en_US vocabulary) is a perfect choice for suggestions.

Have fun with algorithms ... next one? Who knows, probably the soundex :D

Friday, February 13, 2009

After the Array, subclassed String

As I wrote months ago in this documentation about JavaScript prototypal inheritance, subclass native data type is not that simple (almost not possible at all) but since JS is loads of weird bits and bops, I often try to break its documented limits.

It was the Array, some time ago, now it is about time for String.

The concept is similar, if you subclass a native data type its methods will return that native data type like instance, so concat, charAt, toString, etc etc, will return a String unless we do not override every inherited method (so performances wont be that good).

Subclassed String



(function(Function, slice, push){
// from WebReflection: Subclassed String
function String(String){
if(arguments.length) // clever constructor, accepts more than a string as argument
push.apply(this, slice.call(arguments).join("").split(""));
};
String.prototype = new Function;
try{
(new String) + ""; // exception in FireFox
var join = Array.prototype.join;
String.prototype.toString = String.prototype.valueOf = function(){
return join.call(this, "");
};
}catch(e){ // no way to retrieve the length with FireFox
String.prototype.toString = String.prototype.valueOf = function(){
for(var Array = [], i = 0; Array[i] = this[i]; i++);
return Array.join("");
};
};
(window.$ || ($ = {})).String = String; // let's put this in a namespace
})(String, Array.prototype.slice, Array.prototype.push);

That's it, every String.prototype.method should act as it has been called via native String with probably every browser.

Performances


concat, charAt, charCodeAt, toLowerCase, etc, etc will be almost the same of native strings but the constructor will be a bit slower (instances instead of regular "strings").
On the other hand, in every browser the returned instance will be an Array like String, so ...

new $.String("Here", " ", "we", " ", "are!")[0];

will be exactly the char "H" in every compatible browser.

Alternatives?


To create a valid alternative that will NOT be an instanceof String we could use the same Stack trick:

// every browser
(function(Function, join, push){
function String(String){
if(arguments.length)
push.apply(this, slice.call(arguments).join("").split(""));
};
String.prototype.length = 0;
String.prototype.valueOf = String.prototype.toString = function(){
return join.call(this, "");
};
// here we can add every String prototype to the $.String prototype
(window.$ || ($ = {})).String = String;
})(String, Array.prototype.join, Array.prototype.slice, Array.prototype.push);


Have fun ;-)

Tuesday, February 10, 2009

About isNative, is wasting time?

kangax as reply to Diego comment, then Ajaxian as desert with my comments, but three of us completely in the wrong direction.

Why isNative?


As I though and as Diego Perini confirmed, sometimes we would like to trust the environment itself before we act in a certain way.
I wrote about an XMLHttpRequest wrapper for Internet Explorer, but cases are numerous, like a "not compliant" implementation of the Array.prototype.forEach, or others proto, or some injection problems about window.Array, window.String, or who knows what else.
Accordingly, since the JavaScript nature is dynamic, widely dynamic, we would like to be sure that a method is as "specs says" rather than how this or that library implemented it.

Can we trust an unreliable environment?


The first error we all committed was to find the "best way ever" to understand if a function/method has been changed, without considering the fact that the function isNative itself could be a fake!
No matter how good is our way to discover the native code because isNative(this, "isNative"); will ironically return false every time.
A way to partially solve the problem is the usage of const but this keyword makes sense only for FireFox and Safari, so Opera and even worse case Internet Explorer will not be able to create an immutable function.

/*@cc_on var // @*/ const
isNative = function(){
// whatever You need
};

Above snippet, described in one of my old posts, will let us create an immutable function only for certain browsers, but the isNative problem, as I said, will persist for IE, Opera, and probably other browsers as well.

About proposed solutions, plus mine


Both kangax and Diego have written a couple of functions, the first one created a sandbox via iframe while the second one a string sniff.
Both solutions are not that perfect because of iframes or because a function toString method coul be changed arbitrary, since functions are objects as well.
My solution has probably some side effect as well, but unless smebody will be able to demonstrate that it could be hacked, which was for Diego case, I will consider my proposal the most reliable without iframes.

function isNative(object, method){
var r = !!object && !/undefined|string/.test(typeof object[method]);
if(r){
var toString = (method = object[method]).toString;
try{ // necessary for native functions (alert, as example)
delete method.toString;
r = method.toString !== Object.prototype.toString && /\[.+?\]/.test(method);
if(method.toString != toString)
method.toString = toString;
}catch(e){
r = true;
}
};
return r;
};

Come on, break it and demonstrate that as I said, JavaScript is 100% unreliable programming language ;-)

P.S. this discussion remind me those days me and kentaromiura were trying to find a way to solve eval injections, we gave up after a week of tries hacking our code by our own :D

Monday, February 09, 2009

JavaScript, Bear in Mind DOM Methods

We all like the jQuery code style where the instance is returned almost everywhere, but we often forget about DOM methods that do the same and ages before whatever library. I rarely spot some "returned instance" trick in daily posts or even famous libraries.
Here a couple of example:

// daily way ...
var div = document.createElement("div");
document.body.appendChild(div);

// easy way
var div = document.body.appendChild(document.createElement("div"));


// boring way
var div = document.createElement("div");
div.appendChild(document.createTextNode("test"));
document.body.appendChild(div);

// funny way
var div = document.body.appendChild(
document.createElement("div")
).appendChild(
document.createTextNode("test")
).parentNode
;

... on and on, there are several methods used on daily basis that do not consider returned values. So, why don't we like to use them?

var el = document.createElement("div"),
div = document.createElement("div");
alert([
el === document.documentElement.appendChild(el), // el in the DOM
el === el.parentNode.removeChild(el), // el removed
div === document.documentElement.appendChild(div), // div in the DOM
div === div.parentNode.replaceChild(el, div), // div removed, el in the DOM
div === el.parentNode.insertBefore(div, el) // both in the DOM, div returned
].join("\n"));

As summary, try to find a leak possibility with this:

with((document.body || document.documentElement)
.appendChild(
document.createElement("div")
)
.appendChild(
document.createTextNode("not")
)) parentNode
.insertBefore(
document.createTextNode("Why "),
parentNode.firstChild
)
.parentNode
.appendChild(
document.createTextNode(" ?")
)
;

Not even a single variable declared, three elements inserted in a specific node of the DOM and in a different order :D

Sunday, February 08, 2009

Advertisement revolution? A first try for Android, iPhone, and PCs

Introduction


Nowadays, we read news about new libraries, technologies, etc etc, but there is an area in this Web 2.0 era which seems to be "stuck" since big G introduced AdSense: a new way to deliver advertisement, compatible with new devices such as Google Android and Apple iPhone.
Even if Google proposes AdSense for mobiles as well, we know that those simple text boxes do not cope properly with Android and iPhone innovative web browsers.
This is why I have though about a new way to show ads, unobtrusive, cross-browser, and in a way even funny, with less than 2Kb minified and gzipped code, library independent, with everything inside its own closure ;-)

Ad 2.0 Introduction


The iframe below perfectly represents my idea of mobile advertisement, idea that I will explain better a bit further down. Every 10 seconds, a little banner comes up at the bottom of the screen, and it persists for 2 seconds before it disappears. The number of bunners can be from 1 till unlimited, in this case, just 3.

The page above has been written by me and inspired by x.facebook layout and colors, one of the best mobile implementations I have ever seen so far.
You can surf the content of the above iframe with almost every XHTML 1.0 Compliant browser, included FireFox, Safari, Chrome, Internet Explorer, etc etc. but it is ideal for Google Android or Apple iPhone, whatever mode you are using (portrait or landscape) and with a fixed bottom ad that comes up every N seconds and shown for other N seconds, parameters that I could easily customize directly via url, as well as colors and banners (now simply fakes).

A new idea? Only for mobile!


What we miss so far for mobiles are CSS "position:fixed" in iPhone, solved via my ad 2.0 code, and an advertisement system able to appear and disappear with transparency effects and in a Web 2.0 Stylish way. After a couple of hours of neck-breaking, here I am with a football match alike advertisement, something that makes sense in Web 2.0 since the tendency leans towards not refreshing a page in order to rotate displayed banners (concept I introduced ages ago in packed.it project page, a site entirely based on Ajax and where AdSense content is not static).

So basically, since even Facebook with their Ads policy did not introduce a nice way to show some "interesting stuff", I asked myself: what could be the best way to introduce ads without annoying people, without being obtrusive with whatever library, easy to click on touch-screens, and without boring long texts/big spaces since in this mobile era pixels are truly precious?
The fixed bar is entirely a link, it is extremely easy to click with both iPhone and Android, while the advertisement concept is just this: can you sell something with a 32x32 icon plus a summary or a funny sentence able to capture mobile surfers attention? Then in this case, Ad 2.0 does half the work, thanks to its appearing/disappearing effect plus transparency and customizable colors, and the rest is up to your marketing ability ;-)

Here there's a static shot of Ad 2.0 script:


I know a lot of you are thinking: OMG! Advertisement even in my mobile? You are evil! ... well, it is just a simple idea but I am sure we will have ads in our mobile sites soon - so, in my opinion, it is just a matter of time, and the way to show it.

Any comment or suggestion will be more than appreciated, have a nice week.

P.S. Does anybody know how to contact some guy in Facebook?

Friday, February 06, 2009

Another experiment inspired by Google Buttons

I found this post from Ajaxian extremely interesting, and I decided to try to create a better version, if it is possible, of Google Buttons 3.1

The result is not perfect, still alpha, and just for fun, but compatibility seems to be great while multiple buttons visualizations (horizontal or vertical) is extremely simple: write links beside each other, or add a couple of rules for vertical version.



States are hover, active, selected, represented via JavaScript events (onmouseover, onmousedown, onmouseup).
As I said this is just an experiment, but I think as is it is not that bad ... maybeI am wrong :D

Thursday, February 05, 2009

Hunspell for JavaScript ? A quick and dirty one: BJSpell

Somebody asked me to implement a web based spell check without plug-ins or specific browsers requirement (we all know banks security policies, for example, every plug-in could be dangerous but maybe they are still using the most un-secure browser still available these days: Internet Explorer 6 ... )

My answer, straight on: Guys, there is nothing available in pure JavaScript, but I can investigate about it

That's it, a couple of searches via Google, and nothing as expected ... but hey, wait a minute: what's about FireFox spell checker plug in and Open Office project? Here we are, Hunspell is the answer!



BJSpell Project


In a full day of self brain storming, history background cases study, and code, plus debug and tests, I created BJSpell, a basic, fast, cross browser, and runtime implementation capable to underline bad words using a pre compiled Hunspell compatible dictionary (one or more).
The reason I avoided the usage of the classic "click here for the spell checker" is server stress, since the idea was to bring the spell check inside text-areas without polling with short of big texts the server and every N seconds or, the worst case scenario, on every "onkeyup" textarea event!

Pure JavaScript


To create the little speller I used JavaScript on both client and server side. In the project page you can browse the trunk branch and have a look to BJSpell.Jaxer.html page, based on Jaxer (ok, ok ... I know, just another excuse to test my PAMPA-J project ... )

A real world example page


I created a folder in my website with a BJSpell demonstration page ... wait for the cursor in the right area and try, for example, to write something like "... let's try some weirf word ..."

If you have FireFox spell check plugin with english dictionary, you will difficulty spot the difference between the left textarea and the produced output in the right div ... but what the plug-in does not do, is to show you a couple of alternatives, or close words, instead of the wrong one

What's next?


The project is in beta status, so I need your help to debug it and to understand if it has been worthy and if it is truly a good compromise between performances and result. The suggestion is not perfect, truly quick and dirty, just a basic and incomplete implementation, so we will see how and if the project will go on ;-)

Monday, February 02, 2009

PHP static as virtual self, what else for parent?

I find the static keyword in PHP 5.3 absolutely useful, but I instantly felt into a dilemma: what about parent?

With static it is possible to refer to a generic class static property, without referring the class where the method has been declared.
Here there is a simple example:

class Numbers {
public static $value = 0;
public static function getValue(){
return static::$value;
}
}

class One extends Number {
public static $value = 1;
}

class Two extends Number {
public static $value = 2;
}

$one = new One;
$two = new Two;
$one->getValue(); // 1
$two->getValue(); // 2

Above example shows how static behaves when an instance calls a method which uses static keyword.
Whatever subclass it is, we will always obtain the defined value (if any, otherwise the inherited one) as expected.

But what about parent behavior?

If we use parent inside a method it is like using self, the parent keyword will always refers to the parent class, if any, of the method which contains the parent keyword. If we add a function like this in the Numbers class:

public static function getParentValue(){
return parent::$value;
}

every subclass wont be able to show the Numbers $value and an error will be generated instead.

To obtain the desired result we have to use the ReflectionClass:

// class Numbers ...
public static function getParentValue(){
$class = new ReflectionClass($this);
return $class->getParentClass()->getStaticPropertyValue('$value');
}

Above code is much slower than parent::$value because of the instance creation plus its method call for each ReflectionClass instance ($this plus the one returned by getParentClass)

Now, the question is: is it truly necessary? Will we have other ways to retrieve a "static parent" from a method?