Tuesday, July 15, 2014

A W3C Custom Elements Alternative

This W3C specification is awesome, and if you follow Alex Russell you know it has probably been his biggest obsession for years now: custom Web components, something even Internet Explorer 5.5 had thanks to its Element Behaviors capability!

Many Articles, Poor Support :-(

Regardless of the number of articles about Web Components and a dedicated website for Custom Elements, browsers capable of supporting them natively are ... well, most likely none, especially in the mobile world.
This has at least 2 side effects:
  1. delayed adoption and/or interest from developers
  2. the need for performant, unobtrusive, and reliable polyfills
Unfortunately, today there's no such thing as a plurality of choices: even Mozilla's x-tag project, when it comes to Custom Elements, is based on the only polyfill you can find out there, the one proposed by the Polymer framework ... however ...

Polymer Is NOT The Standard

Every developer I've talked to about Custom Elements replied somehow like "yeah, Polymer, I've heard about it ..." ... well, the bad news is that if you think Polymer is Custom Elements, you have no idea what the proposed standard actually is.
It's true that the entire Web Components and Custom Elements "affair" comes from Google, and in the Custom Elements case Dimitri Glazkov, a very nice dev who has already helped me with a few hints, is the main spec editor.

The Misleading Piece

Unfortunately, most web articles about Web Components and Custom Elements end up talking about or pointing at the Polymer platform, but Polymer is a platform apart, same as X-Tag, Ember, or Angular would be: it's not the standard itself!
As a result, most examples you can read online are not exactly what the W3C is proposing, rather what the Polymer platform provides!
Also bear in mind that Custom Elements does not mean Web Components: the proposal is huge, and we should learn more and be able to distinguish between these standards.
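
As a quick reference, here is roughly what the proposed standard itself looks like: document.registerElement plus prototype-based lifecycle callbacks, no Polymer sugar involved. This is a sketch only; registry stands in for document and HTMLElementProto for HTMLElement.prototype, so the shape of the API is clear even outside a browser:

```javascript
// sketch of the proposed W3C Custom Elements API (not Polymer's)
// `registry` plays the role of `document`, `HTMLElementProto` the
// role of `HTMLElement.prototype`: both are stand-ins, not real DOM
function defineMyElement(registry, HTMLElementProto) {
  var proto = Object.create(HTMLElementProto);
  // lifecycle callbacks are part of the proposal
  proto.createdCallback = function () {
    this.textContent = 'hello custom element';
  };
  // per the proposal, custom element names must contain a hyphen
  return registry.registerElement('my-element', {prototype: proto});
}
```

In a browser supporting the proposal this would simply be defineMyElement(document, HTMLElement.prototype), after which every my-element tag gets upgraded automatically.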

Custom Elements As In Polymer

The provided polyfill is a very nice piece of code, well documented and, I believe, well tested, but unfortunately incomplete.
As an example, if you follow its building instructions you will end up requiring an extra MutationObservers polyfill whose first line assumes you are in ES6 and that the WeakMap constructor is available.
The amount of code required to provide a proper WeakMap polyfill in engines where WeakMap is not supported results in an overbloated, probably unnecessary, and not so welcome entire new library on the plate ...
... and then there are web developers who actually care about code size and compatibility with not-that-old browsers too ... and who are not willing to bring in two third-party libraries in order to better support this brand new API ...

A Lightweight Alternative

During the last week I've had a chance to experiment with some sort of cross-domain Custom Element. I won't bother you with the details, since I'd rather share the library that only today I've managed to organize in a repository and test with all the devices and browsers I could: let me introduce my W3C Custom Elements Polyfill, and these are a few of its features:
  • less than 2KB minified and gzipped, with no extra dependencies: you serve this file, you have it!
  • support for a wide range of old-to-modern mobile devices ... iOS 5 and Android 2 are only a few of them; IE9 for Windows Phone 7 made it too, together with webOS 2!
  • focus on one task ... and while this sounds obvious, I don't think that to use Custom Elements we need a proper, partial, cross-browser fix for both MutationObservers and WeakMap ... but the good news is, if you already have a patch for MutationObservers it will be used instead of the old Mutation Events API: it's a win-win!
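
The fallback logic mentioned in the last point can be sketched as follows; onChildAdded is a made-up name, and globalObj stands in for window so the detection itself can be exercised anywhere:

```javascript
// sketch: prefer a native (or already patched) MutationObserver,
// otherwise fall back to the old Mutation Events API
// `onChildAdded` and `globalObj` are illustrative names only
function onChildAdded(globalObj, node, callback) {
  var MO = globalObj.MutationObserver || globalObj.WebKitMutationObserver;
  if (MO) {
    new MO(function (records) {
      records.forEach(function (record) {
        // notify once per added node
        [].forEach.call(record.addedNodes, callback);
      });
    }).observe(node, {childList: true});
  } else {
    // deprecated, but the only option on older engines
    node.addEventListener('DOMNodeInserted', function (e) {
      callback(e.target);
    }, false);
  }
}
```

Either branch feeds the same callback, which is what lets a polyfill stay small: it does not need to ship its own MutationObserver fix, it just uses whichever mechanism the page already has.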

Try It Yourself!

Not only can you test it directly, you can also read the most famous articles and experiment by just including that single file.

As usual, contributions are welcome, and I'll put your name in the MIT license too, no CLA required ;-)

Monday, June 30, 2014

On Meaningful Performance

This post is a complementary write up about my WebPerfDays talk given at Google HQ last Friday.

Slides And More

During the talk I live-demoed an Arduino Yun board and its performance, and I also showed, through a document camera, live performance tests on a wide range of devices including:
  • Android 2.3
  • Bada OS
  • Blackberry 10
  • first ZTE FirefoxOS phone
  • Windows Phone 8
  • ... probably others too ...
While it's easy for me to point at those slides, you'd miss most of the live-demoed content, so here is a better walkthrough.

Embedded Systems Performance

While Intel's Edison can be considered just a promise, there is already a huge variety of boards based on the Atheros AR9331, a single-core MIPS System on a Module @ 400MHz, almost as small as the promised Edison.
The Arduino Yun is only one of those boards featuring such beauty, and here is something interesting about performance.

nodejs and npm work but npm is freaking slow

I've also opened a ticket about this non-critical issue, but long story short: npm --version or just npm --info takes more than 7 seconds to show anything at all; plus, if you don't disable node flags, npm will crash/fail to install anything with the global flag.
The lesson here in a nutshell: if your program is capable of smaller tasks, isolate these and make them executable/available apart, instead of putting every software behavior after loading, parsing, and analyzing the entire logic. --version, as an example, is not even something to think about. Make basic operations available ASAP and lazy load complex operations and their related logic only when needed.
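
As a sketch of that lesson (the names and the version string below are made up, this is not npm's code): answer the cheap questions before loading anything heavy.

```javascript
// hypothetical CLI entry point: cheap flags first, heavy logic later
// `load` is injected so the heavy module is only required on demand
function run(argv, load) {
  // basic operation: answer from a constant, require() nothing
  if (argv.indexOf('--version') !== -1) {
    return '1.2.3'; // placeholder version string
  }
  // complex operations: pay the loading/parsing cost only now
  return load('./heavy-main')(argv);
}
```

On a 400MHz MIPS board the difference between these two branches is exactly the difference between an instant --version and a 7-second one.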

nodejs FileSystem can go faster

It does not matter that node's fs is asynchronous and non-blocking; it still ensures somehow atomic reads and writes, and its performance is usually based on the assumption that today's hard drives have a lot of cache, are very fast ... etc etc ... then you face embedded:

SD cards are actually pretty fast, but if you have concurrent reads you won't have any cache to take advantage of.
Little tricks like those used in the experimental fspeed module might help to reach better concurrent performance without needing much more RAM or CPU power from the tiny board.

nody little server

The entire talk was live-demoed over my Arduino Yun and nodejs running nody, a tiny little server focused on doing one thing: serving (little) files, if present, to as many clients as possible.
nody handled up to 20 devices at the same time, asking for the same or different content, without a glitch.
A project with the same name already exists in npm, so I might think about a new name and make it a proper package, ideal for embedded systems. It's KISS and YAGNI at its best so far :-)

Mobile Web Has Embedded Performance

It would be so simple if everyone had the latest iPhone or the latest Android hardware available; unfortunately, the reality is way different, especially in emerging markets where cheap phones are the target to reach.

2010 Best Practice still valid

Everything I did to make this map experiment move smoothly even on Bada or Android 2 phones is still valid these days:
  • you might want to degrade to 30 FPS instead of 60 and go smoother on older HW
  • you don't want to cache all the things, because you have a limited amount of RAM
  • you might even prefer canvas to draw tiles instead of CSS, because canvas is a single surface to upload and draw, while many CSS squares can impact FPS smoothness/linearity
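
The first point can be sketched with a simple frame divider: keep the requestAnimationFrame loop, draw only every other tick. throttledLoop and its callbacks are illustrative names, not code from the original experiment; raf is injected so the logic is testable outside a browser.

```javascript
// sketch: degrade to ~30 FPS by drawing one frame out of `divider`
// `raf` would be window.requestAnimationFrame in a real page
function throttledLoop(raf, draw, divider) {
  var frame = 0;
  function tick() {
    // render only one frame out of `divider` (2 => ~30 FPS at 60Hz)
    if (frame++ % divider === 0) draw();
    raf(tick);
  }
  raf(tick);
}
```

Half the draws means half the per-frame budget spent, which is often the difference between stutter and smoothness on old hardware.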

Benchmark For Real

The Tesla Experiment is a good benchmark to see how good the GPU is, if used at all, how well the CPU calculates all the lightning, and how many touch inputs we can have on a single screen. A cheap old Bada OS phone actually performs very well there, so why aren't we targeting these devices too?

Do Not Polyfill Old Hardware!

When you realize that simple touch or mouse move events are very heavy and triggered maybe 15 times per second on old Android phones, you must think again about why on earth you would use touch events to simulate Microsoft Pointer Events, a kind of event that also won't bring us much on touch devices ...
Windows 8 and WP8 phones have very good hardware, and the cheapest WP8 phone you can try will trigger any simulated touch event at 60FPS ... deal with it, Microsoft: we need to support old Android and other browsers with their old hardware underneath; you are the only one out there without Touch Events support ... please fix this!

The power of Touch events

Using just the W3C Touch Events interface we can create interesting logic for our Web Apps, like this 54-card deck or some horizontal snapped scroll (the second one with the 2 inside; try to move it horizontally).
In order to create these little demos I used:
  • ie-touch, a simple drop-in able to make WP phones react to touch events
  • dom4, to bring common new DOM Level 4 entries, tested and normalized for probably the widest variety of mobile phones and OSes of the last 5 years
  • ScrollHandler, to understand user gestures
  • SimpleKinetic, to asynchronously calculate directions and movements through deltas
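
The delta math behind the last point can be sketched as follows; the Tracker name and its API are illustrative assumptions, not SimpleKinetic's actual interface:

```javascript
// sketch of delta-based gesture math: two touch samples in,
// direction and velocity out (the inputs of a kinetic tail)
function Tracker() {
  this.x = this.t = 0;
}
Tracker.prototype.sample = function (x, t) {
  var dx = x - this.x,      // movement since last sample (px)
      dt = t - this.t || 1; // elapsed time (ms), avoid division by 0
  this.x = x;
  this.t = t;
  return {
    direction: dx < 0 ? -1 : 1, // -1: leftward, 1: rightward
    velocity: dx / dt           // px per ms, feeds the deceleration
  };
};
```

Feeding touchmove coordinates into sample() on each event is enough to decide, on touchend, where the scroll should keep gliding and how fast.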

Performance That Matter For Real

As mentioned during my talk, if you are still using jsperf to benchmark ++i VS i++, you really have no idea what the problem in your app is ... focus on real bottlenecks and try to avoid holding in RAM all that data, all that DOM, all those images, etc etc ... use storage, cache callbacks instead of entire network or I/O results, find the problem, and solve it reasonably.
Last, but not least, spend a few bucks on some cheap second-hand phone and use it for tests: if it goes reasonably well on that hardware, it will FLY in any other modern browser! Don't trust your emulator when it comes to real raw performance.

Get ready for the future: it will not necessarily be more performant, rather smaller, using less power, and everywhere, as the Internet of Things already is these days!

Tuesday, May 20, 2014

134 bytes for an optimized and very basic jQuery like function

TL;DR this fits in a tweet :-)

What's New Here

The tiniest, most common, and probably most used utility for quick prototyping is usually even smaller:
function $(S,P){return [].slice.call((P||document).querySelectorAll(S))}
but it's a performance killer for all those cases where we use #selectors or even body, since the engine inevitably needs to check all nodes in the DOM.
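
A readable, non-golfed sketch of the kind of fast paths such a function can take (this is an illustration of the idea, not the actual 134-byte code, and the regular expressions are assumptions of mine):

```javascript
// sketch: route #id and :first selectors to cheaper lookups
// and always return a real Array, whatever the branch taken
function $(S, P) {
  var d = P || document, m;
  if ((m = /^#([\w-]+)$/.exec(S))) {
    // single direct lookup instead of a full DOM scan
    var el = (d.getElementById ? d : document).getElementById(m[1]);
    return el ? [el] : [];
  }
  if (/:first$/.test(S)) {
    // stop at the first match instead of collecting them all
    var one = d.querySelector(S.slice(0, -6));
    return one ? [one] : [];
  }
  return [].slice.call(d.querySelectorAll(S));
}
```

The point of the uniform Array result is that every call site can use forEach, map, and friends without first asking "did I get a node, a list, or null?".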

The Handy :first Pseudo Selector

It's hard to believe the W3C preferred two different but similar methods instead of simply adding a pseudo :first selector (which is not standard and, on top of that, does not work like :first-child or any other selector). The purpose would be to select only the first occurrence and stop the DOM parsing in an efficient way; instead, we need to use querySelector, which is different from querySelectorAll.

A Consistent Result

Not only do we have two different methods to use, we also cannot operate on their results in a similar way, since one is collection-like, where the length is the indicator for empty results, while the singular version would eventually be null.

Well, Scratch That!

With desktop and mobile browsers released between 2008 and today we can already do many things using native APIs. Here we have a probably not-always-perfect qSA that might fail with some crazy selector I personally try to avoid anyway, plus many Array extras that today are not extras anymore thanks to any sort of polyfill.
The good news is, we might want to polyfill querySelector/All only for jurassic browsers too ... so here is what we can do:
// set all click actions per each link
$('a').forEach(function (link) {
  link.addEventListener('click', this);
}, function click(e) {
  // prevent default
  e.preventDefault();
  // and show the link
  alert(this.href);
});

// only first occurrence
$('.logout:first').forEach( ... action ... );

Going Wild Extending

The temptation to extend Array.prototype in order to simplify listener handling is huge ... and while I won't tell you that some library did exactly that already, I'd like to show a trick that is less obtrusive and still super productive:
// simple utility to add an event listener
$.on = function (CSS, parentNode, type, handler, capture) {
  // in case parentNode is missing
  if (typeof type !== 'string') {
    capture = handler;
    handler = type;
    type = parentNode;
    parentNode = null;
  }
  return $(CSS, parentNode).map(function (el) {
    el.addEventListener(type, handler, capture);
    return el;
  });
};

// example
$.on('a', 'click', function (e) {
  // prevent default
  e.preventDefault();
  // and show the link
  alert(this.href);
});

That's Pretty Much It

We have a tiny yet performant utility, and many ways to make it better for what we need, on a wide range of already supported browsers.

Friday, May 16, 2014

serving static files twice as fast in node.js

Having asynchronous I/O does not automatically bring "best performance ever", but it helps already.
However, regardless of its amazing performance, node.js is also famous for not being a performant competitor when it comes to serving static files ... is there anything we can do?

Avoiding Multiple Reads

Every time we use fs.readFile we schedule a read operation. This is non-blocking and async, but what happens if two users ask for the same file in the fraction of time between the first read and the second?
// simulating disk I/O
read(file, callback1);
  // .. while reading
read(file, callback2);
  // .. while reading

// two distinct reads scheduled for the very same file
<< disk.read.shift(); // satisfies callback1
<< disk.read.shift(); // satisfies callback2
Wouldn't it be better to have this kind of flow instead?
// simulating disk I/O
read(file, callback1);
read(file, callback2);
  // .. while reading

// one single read scheduled for the very same file
<< disk.read.shift(); // satisfies both callback1 and callback2

We basically satisfied two read requests to the same file at once.

Handled fs.readFile

During the refactoring of a personal project of mine, I've benchmarked this piece of code:
//!@author Andrea Giammarchi
// @module read-file.js
// Concurrent I/O Handling
var
  // still needed, of course
  fs = require('fs'),
  // single shared module cache
  cache = Object.create(null)
;

// same fs.readFile API
this.readFile = function read(name, then) {
  // already requested, let I/O complete
  if (name in cache) {
    // and notify this callback after anyway
    cache[name].push(then);
  } else {
    // first time somebody asked for this
    // create the collection of callbacks
    cache[name] = [then];
    // perform the I/O operation
    fs.readFile(name, function readFile(err, data) {
      var
        // grab all callbacks waiting for a result
        list = cache[name],
        // loop over such list
        i = 0,
        length = list.length
      ;
      // after erasing it
      delete cache[name];
      while (i < length) {
        // ditch a boring/slow try/catch
        // and notify all callbacks ^_^
        list[i++](err, data);
      }
    });
  }
};
The code used to benchmark is basically this one:
// bench.js
for (var
  // require another file or 'fs' by default
  fs = require(process.argv[2] || 'fs'),
  i = 0,
  // start time
  t = Date.now();
  // 25000 simultaneous requests
  i < 25000; i++
) {
  fs.readFile(__filename, function () {
    // end time if all requests have been satisfied
    // will be async anyway ^_^
    if (!--i) console.log(Date.now() - t);
  });
}
Here are the results on my MacBook Pro (lower is better):
  1. node bench.js fs: 505ms
  2. node bench.js ./read-file.js: 19ms

About 25X Faster On Concurrent Requests!

Which I believe is a good achievement when it comes to static file serving, since these files usually don't change often, so there's actually no need at all to read them more than once per group of users ...
The best part of it? An array creation compared to an I/O operation means almost nothing for V8, so even if the file is requested only once at a time and no concurrent connections are in place, the overall performance will be the same, at least on my Mac, at around 9ms per request.