Wednesday, June 12, 2013

On Harmony JavaScript Generators

Developers get easily excited when something so used, acclaimed, and desired in another land comes to their own ... or the one they think they own ...
This is the case with ECMAScript 6 Harmony Generators, something that at this time you need to activate through the --harmony flag in node, or by going to about:flags in the Google Chrome Canary URL bar and enabling experimental harmony/extension.
Once you've done that you'll be able, still only through Chrome Canary at this time, to test the examples, benchmarks, and other things in this post ... ready? So here is the first great news:

Generators Are Slower

Generators are at least two times slower than forEach and about 10 times slower than regular loops.
This is the result shown in this jsperf benchmark, which you can run too.
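For readers without access to that benchmark, here is a minimal sketch of the kind of comparison it makes. It is not the author's jsperf test: the function names, the array size, and the crude Date.now() timer are all assumptions made for illustration, and results will differ between engines and versions, so no timings are quoted here.

```javascript
// Three ways to sum the same array; all names are illustrative only.
function sumWithLoop(list) {
  var total = 0;
  for (var i = 0; i < list.length; i++) total += list[i];
  return total;
}

function sumWithForEach(list) {
  var total = 0;
  list.forEach(function (value) { total += value; });
  return total;
}

// a generator that yields each item of the list, one per next() call
function* values(list) {
  for (var i = 0; i < list.length; i++) yield list[i];
}

function sumWithGenerator(list) {
  var total = 0, it = values(list), step;
  // every next() call returns a { value, done } object
  while (!(step = it.next()).done) total += step.value;
  return total;
}

// crude timer: runs fn `times` times and returns the elapsed milliseconds
function time(fn, list, times) {
  var start = Date.now();
  while (times--) fn(list);
  return Date.now() - start;
}

var data = [];
for (var n = 0; n < 10000; n++) data.push(n);

[sumWithLoop, sumWithForEach, sumWithGenerator].forEach(function (fn) {
  console.log(fn.name, time(fn, data, 100) + 'ms');
});
```

A dedicated benchmarking tool gives more reliable numbers than this timer, which is only meant to show the shape of the test.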
Of course generators are slower: there's a lot of machinery behind each generator, such as:
{
  code: 'the generator function body',
  context: 'the bound/trapped context',
  scope: 'the whole scope used by the generator',
  handler: 'to perform iterations',
  state: 'the current generator private state',
  // inherited from GeneratorConstructorPrototype
  send: 'the method to address values',
  throw: 'the method to throw Errors',
  next: 'the method to keep looping'
}
// plus every 'next' or 'send' call
// will return a fresh new object
{
  value: 'the current value at this time',
  done: 'the boolean value that helps with iterations'
  // when done is true, an extra call
  // to .next() will throw an error!
}
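The "fresh new object" point can be observed directly. The sketch below assumes an engine with generators enabled; counter is a made-up generator used only for illustration.

```javascript
// hypothetical generator, used only to inspect what next() returns
function* counter() {
  yield 1;
  yield 2;
}

var it = counter();
var first = it.next();
var second = it.next();

console.log(first);            // { value: 1, done: false }
console.log(second);           // { value: 2, done: false }
console.log(first === second); // false: each call allocated a new object
```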

Not Only Slower ...

The fact that every interaction with a single generator creates a number of objects means that the garbage collector will work more than necessary, and the RAM will be easily saturated whenever your server does not have much of it. Cheap hosts are still the most common choice and, if the program or language is greedy, why should you spend more on hosting hardware? You should not, as easy as that.

... But Feel Free To Use Them

If you believe generators can help with anything in your logic, infrastructure, or system, and you don't need the best performance in that situation, go with generators. They have been used in Mozilla internals for a while, since Firefox was at version 3 or even lower, can you believe it?
So, if these worked before becoming part of a standard and before hardware was as good as it is today, there must be use cases where generators are a better choice ... right?

JavaScript Never Needed To Sleep !!!

Unfortunately, apart from some academic Fibonacci exercise or, even worse, some sleep(delay) example, there's not much more you'll find about how cool generators are in JS ... simply because JavaScript never really needed them, being an event-handler-oriented programming language, where events have always worked even better than generators do in other languages, since events can be triggered at any point, not just in a synchronous "top-to-bottom" flow.

Coming From Mars

One common problem in JS is that every newcomer would like to find what she or he misses the most from her or his own old programming language ...
  • PHP developers never complained about missing types, though they'll rarely get how prototype inheritance works
  • Java developers complain about missing types ... they'll try to use the flexibility of JS to make it as similar to Java as possible, understanding inheritance slightly better than PHP devs and abusing closures by all means to make it as super() compatible as possible, 'cause ParentClass.call(this); inside the ChildClass constructor freaks them out
  • C# developers think they have all the best there ... forgetting C# is not statically compilable and it is derived from ECMAScript 4th Edition, almost 2 editions before current JavaScript specification ^_^
  • C++ developers will propose new optimized Virtual Machines every day and will most likely never use JS ... still, they will decide how JS developers should use JS regardless
  • Python and Ruby developers will just laugh at all the JS shenanigans, thinking their favorite language has none of them, or worse
Well, here is the thing ... generators and the yield keyword are a really old concept from languages that were not created to work asynchronously as JS does, including all those mentioned in the list above.
That's why I believe the aim of generators is being misunderstood by the JS community ... and once again, feel free to use them as much as you want, but please keep reading too, thanks!

Queuing The Delay

If you start waiting for events after other events in a generator way:
var file1 = yield readingFile('one'),
    file2 = yield readingFile('two'),
    combined = file1.value + file2.value;
Here is the bad news: that won't magically work as you expect!
// a magic function with many yields ...
function* gottaCatchEmAll(fileN) {
  for (var i = 0; i < arguments.length; i++) {
    yield arguments[i];
  }
}

// a magic expected behavior that won't work
// as many might expect ...
var content = gottaCatchEmAll(
  'file1.txt',
  'file2.txt'
);
Until we call content.next(), and then store the returned object's value if no error has been thrown and the done property is false, no parallel file loading will be performed at all!
That's correct: what node.js elegantly solved with what JS was already offering is screwed up again by this new approach, which won't block and won't execute at the same time.
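The laziness described here is easy to verify. In the sketch below, startReading is a hypothetical stand-in that only records that a read was requested (no real file is touched), and the example assumes an engine where next(value) resumes the generator with that value, the job this post assigns to send.

```javascript
var log = [];

// hypothetical stand-in for an asynchronous read:
// it only records that a read was requested
function startReading(name) {
  log.push('reading ' + name);
  return name;
}

function* readAll() {
  var one = yield startReading('file1.txt');
  var two = yield startReading('file2.txt');
  return [one, two];
}

var it = readAll();
console.log(log.length); // 0: creating the generator ran nothing

it.next();
console.log(log); // [ 'reading file1.txt' ]

it.next('content of file1');
console.log(log); // [ 'reading file1.txt', 'reading file2.txt' ]
```

The second read is not even requested until the first yield has been resumed, so the two reads can never overlap unless the caller arranges it.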

Still Room For New Users

The controversial part about generators is that they might be useful to synchronize sequential, inevitably delayed, or dependent executions while still not blocking other handlers ... well, here are a couple of thoughts:
  1. try to make a generator behave as you expect ... seriously!
  2. try to learn how to use a queue instead
Not kidding, the second option is much easier than expected, plus it is a Promise-like approach compatible with every environment, and it fits in a tweet.
function Queue(a,b){
setTimeout(a.next=function(){
return(b=a.shift())?!!b(a,arguments)||!0:!1
},0);
return a}
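Expanded for readability, the tweet-sized function above could be read as follows. This is a sketch of the same logic with descriptive names; ReadableQueue and the two sample callbacks are names invented here, not part of the original code.

```javascript
// Same logic as the tweet-sized Queue, with descriptive names.
function ReadableQueue(callbacks) {
  callbacks.next = function next() {
    // take the next callback out of the list, if any
    var callback = callbacks.shift();
    if (!callback) return false; // nothing left to run
    // each callback receives the queue itself plus whatever
    // arguments were passed to next(), e.g. (err, data) from fs.readFile
    callback(callbacks, arguments);
    return true;
  };
  // start asynchronously, so properties can still be attached
  // to the returned queue before the first callback runs
  setTimeout(callbacks.next, 0);
  return callbacks;
}

var q = ReadableQueue([
  function first(queue) {
    // hand a value over to the next callback
    queue.next('hello');
  },
  function second(queue, args) {
    console.log(args[0] + ' ' + queue.greeting); // hello queue
  }
]);

// attached after creation, but before the queue starts
q.greeting = 'queue';
```

The queue is just the array of callbacks with a next method added, which is why extra state such as greeting can be hung on it directly.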

How Does That Work?

I've tried to explain that in detail in this working with queues blog post, and at the same time I have written a slightly improved queue so that arguments can be passed between callbacks.
var fs = require('fs');
var q = Queue([
  function onRead(queue, args){
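    // args is always an arguments object (never falsy),
    // so check its length to detect a readFile result
    if (args && args.length) {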
      // add result to the content
      queue.content.push(args[1]);
      // if there was an error ...
      if (args[0]) {
        // attach it to the queue object
        queue.error = args[0];
      }
    } else {
      // first time execution
      queue.content = [];
    }
    // if there's anything to read
    if (queue.files.length) {
      // add "priority queue" to itself
      queue.unshift(onRead);
      // so that once done ...
      fs.readFile(
        // ... reducing the number of files to read
        queue.files.shift(),
        // ... will be re-executed
        queue.next
      );
    } else {
      // simply fire the end of this thing
      queue.next();
    }
  },
  function theEnd(queue) {
    // if there was an error ...
    if (queue.error) {
      // throw it or do whatever!
      throw queue.error;
    }
    // otherwise simply show results
    console.log(queue.content.join(''));
  }
]);

// files to load
q.files = [
  'file1.txt',
  '/user/attempt/file2.txt'
];
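For contrast with the sequential queue above, here is a sketch of the plain callback style that starts every read at once. readAllInParallel is a hypothetical helper written for this illustration; the reader function is passed in as a parameter so that fs.readFile, or anything with the same (path, callback) shape, can be used.

```javascript
// readFile is injected so the sketch works with fs.readFile
// or any function with the same (path, callback) shape
function readAllInParallel(paths, readFile, done) {
  var contents = [];
  var pending = paths.length;
  var failed = false;
  if (!pending) return done(null, contents);
  paths.forEach(function (path, index) {
    // every read is started right away, without waiting for the others
    readFile(path, function (error, data) {
      if (failed) return;      // an earlier read already reported an error
      if (error) {
        failed = true;
        return done(error);
      }
      contents[index] = data;  // keep the original order
      if (--pending === 0) done(null, contents);
    });
  });
}

// possible usage with node's fs module, with the file names used above:
//
// readAllInParallel(
//   ['file1.txt', '/user/attempt/file2.txt'],
//   require('fs').readFile,
//   function (error, contents) {
//     if (error) throw error;
//     console.log(contents.join(''));
//   }
// );
```

Results are stored by index rather than pushed, so the output order matches the input order even when the reads complete in a different one.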

OH Come On What Is That

If you think dealing with generators is easier, and that the real effort behind the yield keyword is less verbose than the above abstract example over a single use case, I am here waiting for your link. Show me the ease, the cross version/platform compatibility, and the performance of your generator-based solution (and I am not talking about your latest MacBook Air but about Raspberry Pi-like hardware, which is suitable for and already used as a web server). I am willing to reconsider my point of view and change some modules in order to switch, even if not needed, to this new approach.
Right now I see this new entry as completely overrated, able to bring fragmentation between node.js and the Web, and unable to concretely simplify or solve parallel asynchronous operations as elegantly as events would do through emitters.
Thanks for your effort reading 'till the end.
Some comments outside this blog:
  • Alex Russell on performance, and my reply, which is: bound functions are still slow. I am not expecting generators to be faster than bound functions at any time in the near future

2 comments:

Thanasis Polychronakis said...

thank you for this analysis.

It is always rewarding to read articles that are based on facts and not beliefs or trends.

fromdev.com said...

You are right, at times we have used generators since performance is not always biggest concern. Specially if you are not worried about billions of users.

Thanks for the interesting article.